The following outline is provided as an overview of and topical guide to machine learning. Machine learning is a subfield of soft computing within computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence (http://www.britannica.com/EBchecked/topic/1116194/machine-learning). In 1959, Arthur Samuel defined machine learning as a "field of study that gives computers the ability to learn without being explicitly programmed". Machine learning explores the study and construction of algorithms that can learn from and make predictions on data. Such algorithms operate by building a model from an example training set of input observations in order to make data-driven predictions or decisions expressed as outputs, rather than following strictly static program instructions.
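As a rough illustration of building a model from a training set and then using it to make predictions, the sketch below fits a straight line by ordinary least squares. The technique, the function names `fit_line` and `predict`, and the training data are all assumptions chosen for illustration; they are not taken from any source cited in this outline.

```python
from statistics import mean


def fit_line(xs, ys):
    """Build a model (slope, intercept) from training pairs by least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Spread of the inputs, and how inputs and outputs vary together.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Use the learned parameters to predict an output for an unseen input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training set: input observations and their observed outputs.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [1.0, 3.0, 5.0, 7.0]

model = fit_line(train_x, train_y)   # learned from the data, not hand-coded
prediction = predict(model, 4.0)     # data-driven prediction for a new input
print(model, prediction)
```

The relationship between input and output is never written into the program. It is estimated from the examples, which is the contrast with static program instructions drawn above.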

What ''type'' of thing is machine learning?

* An academic discipline
* A branch of science
** An applied science
*** A subfield of computer science
**** A branch of artificial intelligence
**** A subfield of soft computing
*** Application of statistics

Branches of machine learning

Subfields of machine learning

* Computational learning theory – studying the design and analysis of machine learning algorithms.
* Grammar induction
* Meta-learning

Cross-disciplinary fields involving machine learning

* Adversarial machine learning
* Predictive analytics
* Quantum machine learning
* Robot learning
** Developmental robotics

Applications of machine learning

* Bioinformatics
* Biomedical informatics
* Computer vision
* Customer relationship management
* Data mining
* Earth sciences
* Email filtering
* Inverted pendulum – balance and equilibrium system.
* Natural language processing (NLP)
** Named Entity Recognition
** Automatic summarization
** Automatic taxonomy construction
** Dialog system
** Grammar checker
** Language recognition
*** Handwriting recognition
*** Optical character recognition
*** Speech recognition
**** Text to Speech Synthesis (TTS)
**** Speech Emotion Recognition (SER)
** Machine translation
** Question answering
** Speech synthesis
** Text mining
*** Term frequency–inverse document frequency (tf–idf)
** Text simplification
* Pattern recognition
** Facial recognition system
** Image recognition
* Recommendation system
** Collaborative filtering
** Content-based filtering
** Hybrid recommender systems (Collaborative and content-based filtering)
* Search engine
** Search engine optimization
* Social Engineering

Machine learning hardware

* Graphics processing unit
* Tensor processing unit
* Vision processing unit

Machine learning tools

* Comparison of deep learning software

Machine learning frameworks

Proprietary machine learning frameworks

* Amazon Machine Learning
* Microsoft Azure Machine Learning Studio
* DistBelief – replaced by TensorFlow

Open source machine learning frameworks

* Apache Singa
* Apache MXNet
* Caffe
* PyTorch
* mlpack
* TensorFlow
* Torch
* CNTK
* Accord.NET
* JAX

Machine learning libraries

* Deeplearning4j
* Theano
* scikit-learn
* Keras

Machine learning algorithms

* Almeida–Pineda recurrent backpropagation
* ALOPEX
* Backpropagation
* Bootstrap aggregating
* CN2 algorithm
* Constructing skill trees
* Dehaene–Changeux model
* Diffusion map
* Dominance-based rough set approach
* Dynamic time warping
* Error-driven learning
* Evolutionary multimodal optimization
* Expectation–maximization algorithm
* FastICA
* Forward–backward algorithm
* GeneRec
* Genetic Algorithm for Rule Set Production
* Growing self-organizing map
* Hyper basis function network
* IDistance
* K-nearest neighbors algorithm
* Kernel methods for vector output
* Kernel principal component analysis
* Leabra
* Linde–Buzo–Gray algorithm
* Local outlier factor

* Logic learning machine *

LogitBoost In machine learning and computational learning theory, LogitBoost is a boosting algorithm formulated by Jerome Friedman, Trevor Hastie, and Robert Tibshirani. The original paper casts the AdaBoost algorithm into a statistical framework. Specific ...

Manifold alignment Manifold alignment is a class of machine learning algorithms that produce projections between sets of data, given that the original data sets lie on a common manifold. The concept was first introduced as such by Ham, Lee, and Saul in 2003, adding ...

* Markov chain Monte Carlo (MCMC) *

Minimum redundancy feature selection Minimum redundancy feature selection is an algorithm frequently used in a method to accurately identify characteristics of genes and phenotypes and narrow down their relevance and is usually described in its pairing with relevant feature selection a ...

Mixture of experts Mixture of experts (MoE) refers to a machine learning technique where multiple expert networks (learners) are used to divide a problem space into homogeneous regions. It differs from ensemble techniques in that typically only a few, or 1, expert mo ...

* Multiple kernel learning *

Non-negative matrix factorization Non-negative matrix factorization (NMF or NNMF), also non-negative matrix approximation is a group of algorithms in multivariate analysis and linear algebra where a matrix is factorized into (usually) two matrices and , with the property that ...

Online machine learning In computer science, online machine learning is a method of machine learning in which data becomes available in a sequential order and is used to update the best predictor for future data at each step, as opposed to batch learning techniques whic ...

Out-of-bag error Out-of-bag (OOB) error, also called out-of-bag estimate, is a method of measuring the prediction error of random forests, gradient boosting, boosted decision trees, and other machine learning models utilizing bootstrap aggregating (bagging). Baggi ...

* Prefrontal cortex basal ganglia working memory *

PVLV The primary value learned value (PVLV) model is a possible explanation for the reward-predictive firing properties of dopamine (DA) neurons. It simulates behavioral and neural data on Pavlovian conditioning and the midbrain dopaminergic neurons tha ...

Q-learning ''Q''-learning is a model-free reinforcement learning algorithm to learn the value of an action in a particular state. It does not require a model of the environment (hence "model-free"), and it can handle problems with stochastic transitions an ...

Quadratic unconstrained binary optimization Quadratic unconstrained binary optimization (QUBO), also known as unconstrained binary quadratic programming (UBQP), is a combinatorial optimization problem with a wide range of applications from finance and economics to machine learning. QUBO is ...

Query-level feature A query-level feature or QLF is a ranking feature utilized in a machine-learned ranking algorithm. Example QLFs: * How many times has this query been run in the last month? * How many words are in the query? * What is the sum/average/min/max/media ...

Quickprop Quickprop is an iterative method for determining the minimum of the loss function of an artificial neural network, following an algorithm inspired by the Newton's method. Sometimes, the algorithm is classified to the group of the second order lear ...

Radial basis function network In the field of mathematical modeling, a radial basis function network is an artificial neural network that uses radial basis functions as activation functions. The output of the network is a linear combination of radial basis functions of the inp ...

Randomized weighted majority algorithm The randomized weighted majority algorithm is an algorithm in machine learning theory. It improves the mistake bound of the weighted majority algorithm. Example Imagine that every morning before the stock market opens, we get a prediction from ...

Reinforcement learning Reinforcement learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Reinforcement learning is one of three basic machine ...

Repeated incremental pruning to produce error reduction (RIPPER) In machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial int ...

Rprop Rprop, short for resilient backpropagation, is a learning heuristic for supervised learning in feedforward artificial neural networks. This is a first-order optimization algorithm. This algorithm was created by Martin Riedmiller and Heinrich Brau ...

Rule-based machine learning Rule-based machine learning (RBML) is a term in computer science intended to encompass any machine learning method that identifies, learns, or evolves 'rules' to store, manipulate or apply. The defining characteristic of a rule-based machine learne ...

Skill chaining

Sparse PCA Sparse principal component analysis (sparse PCA) is a specialised technique used in statistical analysis and, in particular, in the analysis of multivariate data sets. It extends the classic method of principal component analysis (PCA) for the red ...

State–action–reward–state–action State–action–reward–state–action (SARSA) is an algorithm for learning a Markov decision process policy, used in the reinforcement learning area of machine learning. It was proposed by Rummery and Niranjan in a technical note with the nam ...

Stochastic gradient descent Stochastic gradient descent (often abbreviated SGD) is an iterative method for optimizing an objective function with suitable smoothness properties (e.g. differentiable or subdifferentiable). It can be regarded as a stochastic approximation of ...

Structured kNN

T-distributed stochastic neighbor embedding t-distributed stochastic neighbor embedding (t-SNE) is a statistical method for visualizing high-dimensional data by giving each datapoint a location in a two or three-dimensional map. It is based on Stochastic Neighbor Embedding originally de ...

Temporal difference learning Temporal difference (TD) learning refers to a class of model-free reinforcement learning methods which learn by bootstrapping from the current estimate of the value function. These methods sample from the environment, like Monte Carlo methods, a ...

Wake-sleep algorithm The wake-sleep algorithm is an unsupervised learning algorithm for a stochastic multilayer neural network. The algorithm adjusts the parameters so as to produce a good density estimator. There are two learning phases, the “wake” phase and the ...

Weighted majority algorithm (machine learning) In machine learning, weighted majority algorithm (WMA) is a meta learning algorithm used to construct a compound algorithm from a pool of prediction algorithms, which could be any type of learning algorithms, classifiers, or even real human exper ...
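As a rough illustration of the weighted majority entry above, the following is a minimal sketch of the deterministic variant for binary predictions. The function name `weighted_majority`, the penalty `beta=0.5`, the tie-breaking rule and the toy votes are all assumptions made for this example, not details taken from the entry.

```python
def weighted_majority(expert_predictions, outcomes, beta=0.5):
    """Combine binary (0/1) expert votes by weighted majority.

    expert_predictions: one list of 0/1 votes per round (one vote per expert).
    outcomes: the true 0/1 label for each round.
    beta: factor applied to the weight of every expert that errs (assumed 0.5).
    Returns (mistakes made by the combined predictor, final weights).
    """
    weights = [1.0] * len(expert_predictions[0])
    mistakes = 0
    for votes, truth in zip(expert_predictions, outcomes):
        weight_for_one = sum(w for w, v in zip(weights, votes) if v == 1)
        weight_for_zero = sum(w for w, v in zip(weights, votes) if v == 0)
        # Ties are broken in favour of 1 (an arbitrary choice for this sketch).
        guess = 1 if weight_for_one >= weight_for_zero else 0
        if guess != truth:
            mistakes += 1
        # Penalise every expert whose vote was wrong this round.
        weights = [w * beta if v != truth else w for w, v in zip(weights, votes)]
    return mistakes, weights


# Toy data: three hypothetical experts over four rounds.
rounds = [[1, 0, 1], [0, 0, 1], [1, 1, 0], [1, 0, 0]]
truths = [1, 0, 1, 1]
mistakes, weights = weighted_majority(rounds, truths)
print(mistakes, weights)  # 0 [1.0, 0.25, 0.125]
```

The first expert is never wrong in this toy data, so its weight stays at 1.0 while the others decay.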

Machine learning methods

Instance-based algorithm

K-nearest neighbors algorithm (KNN)

Learning vector quantization In computer science, learning vector quantization (LVQ) is a prototype-based supervised classification algorithm. LVQ is the supervised counterpart of vector quantization systems. Overview LVQ can be understood as a special case of an artifici ...


Self-organizing map A self-organizing map (SOM) or self-organizing feature map (SOFM) is an unsupervised machine learning technique used to produce a low-dimensional (typically two-dimensional) representation of a higher dimensional data set while preserving the to ...


Regression analysis In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one ...

Logistic regression In statistics, the logistic model (or logit model) is a statistical model that models the probability of an event taking place by having the log-odds for the event be a linear combination of one or more independent var ...

Ordinary least squares regression In statistics, ordinary least squares (OLS) is a type of linear least squares method for choosing the unknown parameters in a linear regression model (with fixed level-one effects of a linear function of a set of explanatory variables) by the pri ...


Linear regression In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is call ...

Stepwise regression In statistics, stepwise regression is a method of fitting regression models in which the choice of predictive variables is carried out by an automatic procedure. In each step, a variable is considered for addition to or subtraction from the set of ...

Multivariate adaptive regression splines In statistics, multivariate adaptive regression splines (MARS) is a form of regression analysis introduced by Jerome H. Friedman in 1991. It is a non-parametric regression technique and can be seen as an extension of linear models that automaticall ...

Regularization algorithm

Ridge regression Ridge regression is a method of estimating the coefficients of multiple-regression models in scenarios where the independent variables are highly correlated. It has been used in many fields including econometrics, chemistry, and engineering. Also ...

Least Absolute Shrinkage and Selection Operator In statistics and machine learning, lasso (least absolute shrinkage and selection operator; also Lasso or LASSO) is a regression analysis method that performs both variable selection and regularization in order to enhance the prediction accuracy ...

Elastic net

Least-angle regression In statistics, least-angle regression (LARS) is an algorithm for fitting linear regression models to high-dimensional data, developed by Bradley Efron, Trevor Hastie, Iain Johnstone and Robert Tibshirani. Suppose we expect a response variable ...

Classifiers

Probabilistic classifier In machine learning, a probabilistic classifier is a classifier that is able to predict, given an observation of an input, a probability distribution over a set of classes, rather than only outputting the most likely class that the observation s ...


Naive Bayes classifier In statistics, naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features (see Bayes classifier). They are among the simplest Baye ...

Binary classifier

Linear classifier In the field of machine learning, the goal of statistical classification is to use an object's characteristics to identify which class (or group) it belongs to. A linear classifier achieves this by making a classification decision based on the val ...

Hierarchical classifier Hierarchical classification is a system of grouping things according to a hierarchy. In the field of machine learning, hierarchical classification is sometimes referred to as instance space decomposition, which splits a complete multi-class pro ...

Dimensionality reduction

Dimensionality reduction Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally ...

Canonical correlation analysis In statistics, canonical-correlation analysis (CCA), also called canonical variates analysis, is a way of inferring information from cross-covariance matrices. If we have two vectors X = (X1, ..., Xn) and Y ...


Factor analysis Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. For example, it is possible that variations in six observed ...

Feature extraction In machine learning, pattern recognition, and image processing, feature extraction starts from an initial set of measured data and builds derived values (features) intended to be informative and non-redundant, facilitating the subsequent learning a ...

Feature selection In machine learning and statistics, feature selection, also known as variable selection, attribute selection or variable subset selection, is the process of selecting a subset of relevant features (variables, predictors) for use in model construc ...

Independent component analysis In signal processing, independent component analysis (ICA) is a computational method for separating a multivariate signal into additive subcomponents. This is done by assuming that at most one subcomponent is Gaussian and that the subcomponents ar ...


Linear discriminant analysis Linear discriminant analysis (LDA), normal discriminant analysis (NDA), or discriminant function analysis is a generalization of Fisher's linear discriminant, a method used in statistics and other fields, to find a linear combination of features ...


Multidimensional scaling Multidimensional scaling (MDS) is a means of visualizing the level of similarity of individual cases of a dataset. MDS is used to translate "information about the pairwise 'distances' among a set of n objects or individuals" into a configurati ...


Non-negative matrix factorization (NMF)

Partial least squares regression Partial least squares regression (PLS regression) is a statistical method that bears some relation to principal components regression; instead of finding hyperplanes of maximum variance between the response and independent variables, it finds a li ...


Principal component analysis Principal component analysis (PCA) is a popular technique for analyzing large datasets containing a high number of dimensions/features per observation, increasing the interpretability of data while preserving the maximum amount of information, and ...


Principal component regression In statistics, principal component regression (PCR) is a regression analysis technique that is based on principal component analysis (PCA). More specifically, PCR is used for estimating the unknown regression coefficients in a standard linear regr ...


Projection pursuit Projection pursuit (PP) is a type of statistical technique which involves finding the most "interesting" possible projections in multidimensional data. Often, projections which deviate more from a normal distribution are considered to be more inter ...

Sammon mapping Sammon mapping or Sammon projection is an algorithm that maps a high-dimensional space to a space of lower dimensionality (see multidimensional scaling) by trying to preserve the structure of inter-point distances in high-dimensional space in the lo ...

t-distributed stochastic neighbor embedding t-distributed stochastic neighbor embedding (t-SNE) is a statistical method for visualizing high-dimensional data by giving each datapoint a location in a two or three-dimensional map. It is based on Stochastic Neighbor Embedding originally de ...


Ensemble learning

Ensemble learning In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. Unlike a statistical ensemble in statisti ...

AdaBoost AdaBoost, short for ''Adaptive Boosting'', is a statistical classification meta-algorithm formulated by Yoav Freund and Robert Schapire in 1995, who won the 2003 Gödel Prize for their work. It can be used in conjunction with many other types of ...

Boosting

Bootstrap aggregating (Bagging)

Ensemble averaging In machine learning, particularly in the creation of artificial neural networks, ensemble averaging is the process of creating multiple models and combining them to produce a desired output, as opposed to creating just one model. Frequently an ens ...

– process of creating multiple models and combining them to produce a desired output, as opposed to creating just one model. Frequently an ensemble of models performs better than any individual model, because the various errors of the models "average out."

Gradient boosted decision tree Gradient boosting is a machine learning technique used in regression and classification tasks, among others. It gives a prediction model in the form of an ensemble of weak prediction models, which are typically decision trees. When a decision tre ...


Gradient boosting Gradient boosting is a machine learning technique used in regression and classification tasks, among others. It gives a prediction model in the form of an ensemble of weak prediction models, which are typically decision trees. When a decision t ...

(also called gradient boosting machine, GBM)

Random Forest Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of th ...

Stacked Generalization (blending)

Meta-learning

Inductive bias The inductive bias (also known as learning bias) of a learning algorithm is the set of assumptions that the learner uses to predict outputs of given inputs that it has not encountered. In machine learning, one aims to construct algorithms that a ...

Metadata Metadata is "data that provides information about other data", but not the content of the data, such as the text of a message or the image itself. There are many distinct types of metadata, including: * Descriptive metadata – the descriptive ...

Reinforcement learning

State–action–reward–state–action (SARSA)

Temporal difference learning (TD)

Learning Automata A learning automaton is one type of machine learning algorithm studied since the 1970s. Learning automata select their current action based on past experiences from the environment. It will fall into the range of reinforcement learning if the environme ...

Supervised learning

Supervised learning Supervised learning (SL) is a machine learning paradigm for problems where the available data consists of labelled examples, meaning that each data point contains features (covariates) and an associated label. The goal of supervised learning alg ...

Averaged one-dependence estimators Averaged one-dependence estimators (AODE) is a probabilistic classification learning technique. It was developed to address the attribute-independence problem of the popular naive Bayes classifier. It frequently develops substantially more accur ...


Artificial neural network Artificial neural networks (ANNs), usually simply called neural networks (NNs) or neural nets, are computing systems inspired by the biological neural networks that constitute animal brains. An ANN is based on a collection of connected unit ...

Case-based reasoning In artificial intelligence and philosophy, case-based reasoning (CBR), broadly construed, is the process of solving new problems based on the solutions of similar past problems. In everyday life, an auto mechanic who fixes an engine by recalli ...

Gaussian process regression In statistics, originally in geostatistics, kriging or Kriging, also known as Gaussian process regression, is a method of interpolation based on Gaussian process governed by prior covariances. Under suitable assumptions of the prior, kriging giv ...

Gene expression programming In computer programming, gene expression programming (GEP) is an evolutionary algorithm that creates computer programs or models. These computer programs are complex tree structures that learn and adapt by changing their sizes, shapes, and compos ...

Group method of data handling Group method of data handling (GMDH) is a family of inductive algorithms for computer-based mathematical modeling of multi-parametric datasets that features fully automatic structural and parametric optimization of models. GMDH is used in such fiel ...

Inductive logic programming

Instance-based learning In machine learning, instance-based learning (sometimes called memory-based learning) is a family of learning algorithms that, instead of performing explicit generalization, compare new problem instances with instances seen in training, which have b ...

Lazy learning In machine learning, lazy learning is a learning method in which generalization of the training data is, in theory, delayed until a query is made to the system, as opposed to eager learning, where the system tries to generalize the training data bef ...

Learning Vector Quantization In computer science, learning vector quantization (LVQ) is a prototype-based supervised classification algorithm. LVQ is the supervised counterpart of vector quantization systems. Overview LVQ can be understood as a special case of an artifici ...

Logistic Model Tree In computer science, a logistic model tree (LMT) is a classification model with an associated supervised training algorithm that combines logistic regression (LR) and decision tree learning. Logistic model trees are based on the earlier idea of a ...

Minimum message length Minimum message length (MML) is a Bayesian information-theoretic method for statistical model comparison and selection. It provides a formal information theory restatement of Occam's Razor: even when models are equal in their measure of fit-accurac ...

(decision trees, decision graphs, etc.)

Nearest Neighbor Algorithm The nearest neighbour algorithm was one of the first algorithms used to solve the travelling salesman problem approximately. In that problem, the salesman starts at a random city and repeatedly visits the nearest city until all have been visited. ...

Analogical modeling Analogical modeling (AM) is a formal theory of exemplar based analogical reasoning, proposed by Royal Skousen, professor of Linguistics and English language at Brigham Young University in Provo, Utah. It is applicable to language modeling and othe ...

Probably approximately correct learning In computational learning theory, probably approximately correct (PAC) learning is a framework for mathematical analysis of machine learning. It was proposed in 1984 by Leslie Valiant (L. Valiant. A theory of the learnable. Communications of the A ...


Ripple down rules Ripple-down rules (RDR) are a way of approaching knowledge acquisition. Knowledge acquisition refers to the transfer of knowledge from human experts to knowledge-based systems. Introductory material Ripple-down rules are an incremental approac ...

Symbolic machine learning algorithms

Support vector machine In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. Developed at AT&T Bell Laboratorie ...


Random Forests Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of th ...

Ensembles of classifiers In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. Unlike a statistical ensemble in statisti ...

Bootstrap aggregating (bagging)

Boosting (meta-algorithm) In machine learning, boosting is an ensemble meta-algorithm for primarily reducing bias, and also variance in supervised learning, and a family of machine learning algorithms that convert weak learners to strong ones. Boosting is based on the que ...

Ordinal classification

Information fuzzy networks (IFN)

Conditional Random Field Conditional random fields (CRFs) are a class of statistical modeling methods often applied in pattern recognition and machine learning and used for structured prediction. Whereas a classifier predicts a label for a single sample without consid ...

ANOVA Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation procedures (such as the "variation" among and between groups) used to analyze the differences among means. ANOVA was developed by the statistician ...

Quadratic classifier In statistics, a quadratic classifier is a statistical classifier that uses a quadratic decision surface to separate measurements of two or more classes of objects or events. It is a more general version of the linear classifier. The classific ...


k-nearest neighbor In statistics, the ''k''-nearest neighbors algorithm (''k''-NN) is a non-parametric supervised learning method first developed by Evelyn Fix and Joseph Hodges in 1951, and later expanded by Thomas Cover. It is used for classification and reg ...

Boosting

SPRINT

Bayesian network A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Bay ...


Naive Bayes In statistics, naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features (s ...

Hidden Markov model A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process — call it X — with unobservable ("''hidden''") states. As part of the definition, HMM requires that there be an ob ...


Hierarchical hidden Markov model The hierarchical hidden Markov model (HHMM) is a statistical model derived from the hidden Markov model (HMM). In an HHMM, each state is considered to be a self-contained probabilistic model. More precisely, each state of the HHMM is itself an HHMM ...

Bayesian

Bayesian statistics Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability where probability expresses a ''degree of belief'' in an event. The degree of belief may be based on prior knowledge about the event, ...

Bayesian knowledge base

Gaussian Naive Bayes

Multinomial Naive Bayes

Averaged One-Dependence Estimators Averaged one-dependence estimators (AODE) is a probabilistic classification learning technique. It was developed to address the attribute-independence problem of the popular naive Bayes classifier. It frequently develops substantially more accur ...


Bayesian Belief Network A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Bay ...


Bayesian Network A Bayesian network (also known as a Bayes network, Bayes net, belief network, or decision network) is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph (DAG). Bay ...


Decision tree algorithms


Decision tree A decision tree is a decision support tool that uses a tree-like model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm that only contains condit ...

Classification and regression tree Decision tree learning is a supervised learning approach used in statistics, data mining and machine learning. In this formalism, a classification or regression decision tree is used as a predictive model to draw conclusions about a set of obse ...

Iterative Dichotomiser 3 (ID3)

C4.5 algorithm C4.5 is an algorithm used to generate a decision tree developed by Ross Quinlan. C4.5 is an extension of Quinlan's earlier ID3 algorithm. The decision trees generated by C4.5 can be used for classification, and for this reason, C4.5 is often referr ...

C5.0 algorithm

Chi-squared Automatic Interaction Detection Chi-square automatic interaction detection (CHAID) is a decision tree technique based on adjusted significance testing (Bonferroni correction, Holm-Bonferroni testing). The technique was developed in South Africa and was published in 1980 by Gor ...


Decision stump A decision stump is a machine learning model consisting of a one-level decision tree. That is, it is a decision tree with one internal node (the root) which is immediately connected to the terminal nodes (its leaves). A decision stump makes a predi ...

Conditional decision tree

ID3 algorithm In decision tree learning, ID3 (Iterative Dichotomiser 3) is an algorithm invented by Ross Quinlan (Quinlan, J. R. 1986. Induction of Decision Trees. Mach. Learn. 1, 1 (Mar. 1986), 81–106) used to generate a decision tree from a dataset. ID3 is the ...

Random forest Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of th ...

SLIQ
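The decision stump entry in this section describes a one-level tree: a single test at the root that leads straight to two leaves. A minimal sketch of fitting such a stump on one numeric feature might look like the following. The function name `fit_stump`, the midpoint candidate thresholds and the misclassification-count criterion are assumptions chosen for illustration.

```python
def fit_stump(values, labels):
    """Fit a one-level decision tree on a single numeric feature.

    Tries every midpoint between consecutive distinct values as a threshold,
    with both possible leaf labelings, and keeps the split with the fewest
    misclassified training points.
    Returns (threshold, left_label, right_label, errors), or None when the
    feature has fewer than two distinct values.
    """
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best = None
    for threshold in candidates:
        for left_label, right_label in ((0, 1), (1, 0)):
            errors = sum(
                (left_label if v <= threshold else right_label) != y
                for v, y in zip(values, labels)
            )
            if best is None or errors < best[3]:
                best = (threshold, left_label, right_label, errors)
    return best


# Toy data: small values belong to class 0, large values to class 1.
feature = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
target = [0, 0, 0, 1, 1, 1]
stump = fit_stump(feature, target)
print(stump)  # (4.5, 0, 1, 0)
```

Stumps like this are commonly used as the weak learners in boosting methods such as AdaBoost.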

Linear classifier

Fisher's linear discriminant Linear discriminant analysis (LDA), normal discriminant analysis (NDA), or discriminant function analysis is a generalization of Fisher's linear discriminant, a method used in statistics and other fields, to find a linear combination of features ...

Multinomial logistic regression In statistics, multinomial logistic regression is a classification method that generalizes logistic regression to multiclass problems, i.e. with more than two possible discrete outcomes. T ...

Perceptron In machine learning, the perceptron (or McCulloch-Pitts neuron) is an algorithm for supervised learning of binary classifiers. A binary classifier is a function which can decide whether or not an input, represented by a vector of numbers, belon ...
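To make the perceptron entry above concrete, here is a minimal sketch of the classic perceptron update rule, trained on the logical AND function. The helper names `train_perceptron` and `predict`, the learning rate, the epoch count and the toy data are all assumptions for this example.

```python
def predict(weights, bias, x):
    """Binary decision: 1 if the weighted sum plus bias is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train_perceptron(samples, labels, epochs=10, lr=1.0):
    """Train a perceptron with the standard mistake-driven update.

    On each sample the weights move by lr * (label - prediction) * input,
    so nothing changes when the prediction is already correct.
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


# Logical AND is linearly separable, so the perceptron can learn it.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]
weights, bias = train_perceptron(X, y)
print([predict(weights, bias, x) for x in X])  # [0, 0, 0, 1]
```

A function that is not linearly separable, such as XOR, would never settle under this update rule.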

Unsupervised learning

Unsupervised learning Unsupervised learning is a type of algorithm that learns patterns from untagged data. The hope is that through mimicry, which is an important mode of learning in people, the machine is forced to build a concise representation of its world and t ...

Expectation-maximization algorithm

Vector Quantization Vector quantization (VQ) is a classical quantization technique from signal processing that allows the modeling of probability density functions by the distribution of prototype vectors. It was originally used for data compression. It works by di ...

Generative topographic map Generative topographic map (GTM) is a machine learning method that is a probabilistic counterpart of the self-organizing map (SOM), is provably convergent and does not require a shrinking neighborhood or a decreasing step size. It is a generative mo ...

Information bottleneck method

Association rule learning Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large databases. It is intended to identify strong rules discovered in databases using some measures of interestingness. ...


Apriori algorithm Apriori (Rakesh Agrawal and Ramakrishnan Srikant, "Fast algorithms for mining association rules", Proceedings of the 20th International Conference on Very Large Data Bases, VLDB, pages 487-499, Santiago, Chile, September 1994) is an algorithm for frequent ...

Eclat algorithm

Artificial neural networks

Feedforward neural network A feedforward neural network (FNN) is an artificial neural network wherein connections between the nodes do ''not'' form a cycle. As such, it is different from its descendant: recurrent neural networks. The feedforward neural network was the ...

Extreme learning machine Extreme learning machines are feedforward neural networks for classification, regression, clustering, sparse approximation, compression and feature learning with a single layer or ...

Convolutional neural network In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of artificial neural network (ANN), most commonly applied to analyze visual imagery. CNNs are also known as Shift Invariant or Space Invariant Artificial Neural Netwo ...

Recurrent neural network A recurrent neural network (RNN) is a class of artificial neural networks where connections between nodes can create a cycle, allowing output from some nodes to affect subsequent input to the same nodes. This allows it to exhibit temporal dynamic ...

Long short-term memory (LSTM)

Logic learning machine

Association rule learning

Eclat algorithm

FP-growth algorithm

Hierarchical clustering

Hierarchical clustering In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into ...

Single-linkage clustering In statistics, single-linkage clustering is one of several methods of hierarchical clustering. It is based on grouping clusters in bottom-up fashion (agglomerative clustering), at each step combining two clusters that contain the closest pair of el ...

Conceptual clustering Conceptual clustering is a machine learning paradigm for unsupervised classification that has been defined by Ryszard S. Michalski in 1980 (Fisher 1987, Michalski 1980) and developed mainly during the 1980s. It is distinguished from ordinary data ...

Cluster analysis

Cluster analysis Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of ...

BIRCH BIRCH (balanced iterative reducing and clustering using hierarchies) is an unsupervised data mining algorithm used to perform hierarchical clustering over particularly large data sets.

DBSCAN Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm proposed by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu in 1996. It is a density-based clustering non-parametric algorithm: give ...

Expectation-maximization (EM)

Fuzzy clustering Fuzzy clustering (also referred to as soft clustering or soft ''k''-means) is a form of clustering in which each data point can belong to more than one cluster. Clustering or cluster analysis involves assigning data points to clusters such that i ...

Hierarchical Clustering In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into ...

K-means clustering ''k''-means clustering is a method of vector quantization, originally from signal processing, that aims to partition ''n'' observations into ''k'' clusters in which each observation belongs to the cluster with the nearest mean (cluster centers or ...

K-medians In statistics, ''k''-medians clustering (P. S. Bradley, O. L. Mangasarian, and W. N. Street, "Clustering via Concave Minimization," in Advances in Neural Information Processing Systems, vol. 9, M. C. Mozer, M. I. Jordan, and T. Petsche, Eds. Cambridg ...

Mean-shift Mean shift is a non-parametric feature-space mathematical analysis technique for locating the maxima of a density function, a so-called mode-seeking algorithm. Application domains include cluster analysis in computer vision and image processing. ...

OPTICS algorithm Ordering points to identify the clustering structure (OPTICS) is an algorithm for finding density-based clusters in spatial data. It was presented by Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel and Jörg Sander. Its basic idea is simil ...
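The k-means entry in this section describes partitioning observations so that each belongs to the cluster with the nearest mean. A minimal one-dimensional sketch of the usual alternating procedure (often called Lloyd's algorithm) is shown below. The function name `k_means_1d`, the fixed initial centers and the toy points are assumptions for illustration; practical implementations usually choose initial centers randomly or with a seeding heuristic.

```python
def k_means_1d(points, centers, iterations=10):
    """Cluster 1-D points by alternating assignment and mean-update steps.

    points: list of floats to cluster.
    centers: initial cluster centers (one per desired cluster).
    Returns (final centers, list of clusters in the same order).
    """
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments are stable, so the procedure has converged
        centers = new_centers
    return centers, clusters


data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = k_means_1d(data, centers=[1.0, 2.0])
print(centers)   # [2.0, 11.0]
print(clusters)  # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```

Even with both initial centers placed inside the first group, the update steps pull one center over to the second group within a few iterations.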

Anomaly detection

Anomaly detection In data analysis, anomaly detection (also referred to as outlier detection and sometimes as novelty detection) is generally understood to be the identification of rare items, events or observations which deviate significantly from the majority ...

''k''-nearest neighbors algorithm (''k''-NN)

Semi-supervised learning

Semi-supervised learning Weak supervision is a branch of machine learning where noisy, limited, or imprecise sources are used to provide supervision signal for labeling large amounts of training data in a supervised learning setting. This approach alleviates the burden of o ...

Active learning

– special case of semi-supervised learning in which a learning algorithm is able to interactively query the user (or some other information source) to obtain the desired outputs at new data points. * Generative models * Low-density separation * Graph-based methods *

Co-training Co-training is a machine learning algorithm used when there are only small amounts of labeled data and large amounts of unlabeled data. One of its uses is in text mining for search engines. It was introduced by Avrim Blum and Tom Mitchell in 1998. ...

* Transduction
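The active learning entry above describes a learner that interactively queries an information source for labels at chosen data points. A minimal sketch of that loop, under strong simplifying assumptions: the data are one-dimensional, the true labels are 0 below an unknown threshold and 1 at or above it, and the smallest and largest pool points have labels 0 and 1 respectively. The names `active_learn_threshold` and `oracle` are hypothetical; `oracle` stands in for the user or other labeling source.

```python
def active_learn_threshold(pool, oracle, budget):
    """Pool-based active learning for a 1-D threshold classifier.

    Returns (estimated_boundary, number_of_labels_requested).
    Assumes oracle(min(pool)) == 0 and oracle(max(pool)) == 1.
    """
    pool = sorted(pool)
    # Seed the labeled set with the two extremes of the pool.
    labeled = {pool[0]: oracle(pool[0]), pool[-1]: oracle(pool[-1])}

    def bracket():
        # Largest point known to be 0 and smallest point known to be 1.
        lo = max(x for x, y in labeled.items() if y == 0)
        hi = min(x for x, y in labeled.items() if y == 1)
        return lo, hi

    for _ in range(budget):
        lo, hi = bracket()
        boundary = (lo + hi) / 2
        # Only points strictly inside the bracket are still uncertain.
        candidates = [x for x in pool if lo < x < hi]
        if not candidates:
            break
        # Query the point the current model is least sure about:
        # the one closest to the current decision boundary.
        query = min(candidates, key=lambda x: abs(x - boundary))
        labeled[query] = oracle(query)

    lo, hi = bracket()
    return (lo + hi) / 2, len(labeled)


if __name__ == "__main__":
    boundary, n_labels = active_learn_threshold(
        list(range(101)), lambda x: int(x >= 37), budget=20)
    print(boundary, n_labels)
```

In this toy setting the learner locates the boundary with far fewer labels than the pool contains, which is the motivation for choosing queries rather than labeling points at random.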

Deep learning

Deep learning Deep learning (also known as deep structured learning) is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised or unsupervised. De ...

Deep belief network In machine learning, a deep belief network (DBN) is a generative graphical model, or alternatively a class of deep neural network, composed of multiple layers of latent variables ("hidden units"), with connections between the layers but not betw ...

s * Deep

Boltzmann machine A Boltzmann machine (also called Sherrington–Kirkpatrick model with external field or stochastic Ising–Lenz–Little model) is a stochastic spin-glass model with an external field, i.e., a Sherrington–Kirkpatrick model, that is a stochastic ...

s * Deep

s *

Hierarchical temporal memory Hierarchical temporal memory (HTM) is a biologically constrained machine intelligence technology developed by Numenta. Originally described in the 2004 book ''On Intelligence'' by Jeff Hawkins with Sandra Blakeslee, HTM is primarily used today for ...

Generative Adversarial Network A generative adversarial network (GAN) is a class of machine learning frameworks designed by Ian Goodfellow and his colleagues in June 2014. Two neural networks contest with each other in the form of a zero-sum game, where one agent's gain is a ...

** Style transfer *

Transformer A transformer is a deep learning model architecture based on the self-attention mechanism, introduced in 2017 in the paper "Attention Is All You Need". It is used primarily in natural language processing and computer vision.

* Stacked Auto-Encoders

Machine learning research

List of artificial intelligence projects The following is a list of current and past, non-classified notable artificial intelligence projects. Specialized projects Brain-inspired * Blue Brain Project, an attempt to create a synthetic brain by reverse-engineering the mammalian brain do ...

List of datasets for machine learning research These datasets are applied for machine learning research and have been cited in peer-reviewed academic journals. Datasets are an integral part of the field of machine learning. Major advances in this field can result from advances in learning ...

History of machine learning

History of machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...

* Timeline of machine learning

Machine learning projects

Machine learning projects *

DeepMind DeepMind Technologies is a British artificial intelligence subsidiary of Alphabet Inc. and research laboratory founded in 2010. DeepMind was acquired by Google in 2014 and became a wholly owned subsid ...

Google Brain Google Brain is a deep learning artificial intelligence research team under the umbrella of Google AI, a research division at Google dedicated to artificial intelligence. Formed in 2011, Google Brain combines open-ended machine learning research w ...

OpenAI OpenAI is an artificial intelligence (AI) research laboratory consisting of the for-profit corporation OpenAI LP and its parent company, the non-profit OpenAI Inc. The company conducts research in the field of AI with the stated goal of promo ...

Meta AI Meta AI is an artificial intelligence laboratory that belongs to Meta Platforms Inc. (formerly known as Facebook, Inc.) Meta AI intends to develop various forms of artificial intelligence, improving augmented and artificial reality technologies ...

Machine learning organizations

Machine learning organizations *

Knowledge Engineering and Machine Learning Group The Knowledge Engineering and Machine Learning group (KEMLg) is a research group belonging to the Technical University of Catalonia (UPC) – BarcelonaTech. It was founded by Prof. Ulises Cortés. The group has been active in the Artificial I ...

Machine learning conferences and workshops

* Artificial Intelligence and Security (AISec) (co-located workshop with CCS) *

Conference on Neural Information Processing Systems The Conference and Workshop on Neural Information Processing Systems (abbreviated as NeurIPS and formerly NIPS) is a machine learning and computational neuroscience conference held every December. The conference is currently a double-track meeti ...

(NIPS) *

ECML PKDD ECML PKDD, the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, is one of the leading academic conferences on machine learning and knowledge discovery, held in Europe every year. History ECML ...

International Conference on Machine Learning The International Conference on Machine Learning (ICML) is the leading international academic conference in machine learning. Along with NeurIPS and ICLR, it is one of the three primary conferences of high impact in machine learning and artificia ...

(ICML)
ML4ALL
(Machine Learning For All)

Machine learning publications

Books on machine learning

Books about machine learning

Machine learning journals

* ''

Machine Learning Machine Learning is a peer-reviewed scientific journal, published since 1986, covering research on machine learning.

'' * ''

Journal of Machine Learning Research The ''Journal of Machine Learning Research'' is a peer-reviewed open access scientific journal covering machine learning. It was established in 2000 and the first editor-in-chief was Leslie Kaelbling. The current editors-in-chief are Francis Ba ...

'' (JMLR) * ''

Neural Computation Neural Computation is a peer-reviewed scientific journal published by MIT Press, covering computational neuroscience and neural network research.

Persons influential in machine learning

Alberto Broggi Alberto Broggi is General Manager at VisLab srl (spinoff of the University of Parma acquired by Silicon-Valley company Ambarella Inc. in June 2015) and a professor of Computer Engineering at the University of Parma in Italy. Research in compu ...

* Andrei Knyazev *

Andrew McCallum Andrew McCallum is a professor in the computer science department at University of Massachusetts Amherst. His primary specialties are in machine learning, natural language processing, information extraction, information integration, and social n ...

Andrew Ng Andrew Yan-Tak Ng (born 1976) is a British-born American computer scientist and technology entrepreneur focusing on machine learning and AI. Ng was a co-founder and head of Google Brain and was the former Chief Scientist at Baidu, building ...

* Anuraag Jain *

Armin B. Cremers Armin Bernd Cremers (born June 7, 1946) is a German mathematician and computer scientist. He is a professor in the computer science institute at the University of Bonn, Germany. He is most notable for his contributions to several fields of discre ...

Ayanna Howard Ayanna MacCalla Howard (born January 24, 1972) is an American roboticist, entrepreneur and educator currently serving as the dean of the College of Engineering at Ohio State University. Assuming the post in March 2021, Howard became the first wom ...

Barney Pell Barney Pell (born March 18, 1968) is an American entrepreneur, angel investor and computer scientist. He was co-founder and CEO of Powerset, a pioneering natural language search startup, search strategist and architect for Microsoft's Bing searc ...

Ben Goertzel Ben Goertzel is a cognitive scientist, artificial intelligence researcher, CEO and founder of SingularityNET, leader of the OpenCog Foundation, and the AGI Society, and chair of Humanity+. He helped popularize the term 'artificial general intell ...

Ben Taskar Ben Taskar (March 3, 1977 – November 18, 2013) was a professor and researcher in the area of machine learning and applications to computational linguistics and computer vision. He was a Magerman Term Associate Professor for Computer and Info ...

Bernhard Schölkopf Bernhard Schölkopf is a German computer scientist (born 20 February 1968) known for his work in machine learning, especially on kernel methods and causality. He is a director at the Max Planck Institute for Intelligent Systems in Tübingen, Ge ...

Brian D. Ripley Brian David Ripley FRSE (born 29 April 1952) is a British statistician. From 1990, he was professor of applied statistics at the University of Oxford and is also a professorial fellow at St Peter's College, Oxford. He retired ...

* Christopher G. Atkeson *

Corinna Cortes Corinna Cortes is a Danish computer scientist known for her contributions to machine learning. She is currently the Head of Google Research, New York City, New York. Cortes is a recipient of the Paris Kanellakis Theory and ...

Demis Hassabis Demis Hassabis (born 27 July 1976) is a British artificial intelligence researcher and entrepreneur. In his early career he was a video game AI programmer and designer, and an expert player of board games. He is the chief executive officer and ...

Douglas Lenat Douglas Bruce Lenat (born 1950) is the CEO of Cycorp, Inc. of Austin, Texas, and has been a prominent researcher in artificial intelligence; he was awarded the biennial IJCAI Computers and Thought Award in 1976 for creating the machine learning p ...

* Eric Xing * Ernst Dickmanns *

Geoffrey Hinton Geoffrey Everest Hinton (born 6 December 1947) is a British-Canadian cognitive psychologist and computer scientist, most noted for his work on ar ...

– co-inventor of the backpropagation and contrastive divergence training algorithms *

Hans-Peter Kriegel Hans-Peter Kriegel (1 October 1948, Germany) is a German computer scientist and professor at the Ludwig Maximilian University of Munich and leading the Database Systems Group in the Department of Computer Science. He was previously professor at ...

Hartmut Neven Hartmut Neven (born 1964) is a scientist working in quantum computing, computer vision, robotics and computational neuroscience. He is best known for his work in face and object recognition and his contributions to quantum machine learning. He is ...

Heikki Mannila Heikki Olavi Mannila (born 4 January 1960 in Espoo) is a Finnish computer scientist, the president of the Academy of Finland (''Kuka kukin on 2007'', p. 585. Helsinki 2006). Mannila earned his Ph.D. in 1985 from the University of Helsinki under t ...

Ian Goodfellow Ian J. Goodfellow is a computer scientist, engineer, and executive, most noted for his work on artificial neural networks and deep learning. He was previously employed as a research scientist at Google Brain and director of machine lea ...

– co-inventor of generative adversarial networks * Jacek M. Zurada *

Jaime Carbonell Jaime Guillermo Carbonell (July 29, 1953 – February 28, 2020) was a computer scientist who made seminal contributions to the development of natural language processing tools and technologies. His extensive research in machine translation result ...

* Jeremy Slovak *

Jerome H. Friedman Jerome Harold Friedman (born December 29, 1939) is an American statistician, consultant and Professor of Statistics at Stanford University, known for his contributions in the field of statistics and data mining.

John D. Lafferty John D. Lafferty is an American scientist, Professor at Yale University and leading researcher in machine learning. He is best known for proposing the Conditional Random Fields with Andrew McCallum and Fernando C.N. Pereira. Biography In 2017, ...

* John Platt – invented SMO and Platt scaling *

Julie Beth Lovins Julie Beth Lovins (October 19, 1945, in Washington, D.C. – January 26, 2018, in Mountain View, California) was a computational linguist who published the Lovins Stemming Algorithm, a type of stemming algorithm for w ...

Jürgen Schmidhuber Jürgen Schmidhuber (born 17 January 1963) is a German computer scientist most noted for his work in the field of artificial intelligence, deep learning and artificial neural networks. He is a co-director of the Dalle Molle Institute for Artif ...

Karl Steinbuch Karl W. Steinbuch (June 15, 1917 in Stuttgart-Bad Cannstatt – June 4, 2005 in Ettlingen) was a German computer scientist, cyberneticist, and electrical engineer. He was an early and influential researcher of German computer science, and was ...

Katia Sycara Ekaterini Panagiotou Sycara (Greek: Κάτια Συκαρά) is a Greek computer scientist. She is an Edward Fredkin Research Professor of Robotics in the Robotics Institute, School of Computer Science at Carnegie Mellon University internationally ...

Leo Breiman Leo Breiman (January 27, 1928 – July 5, 2005) was a distinguished statistician at the University of California, Berkeley. He was the recipient of numerous honors and awards, and was a member of the United States National Academy of Sciences. ...

– invented bagging and random forests *

Lise Getoor Lise Getoor is a professor in the computer science department, at the University of California, Santa Cruz, and an adjunct professor in the Computer Science Department at the University of Maryland, College Park. Her primary research interests a ...

Luca Maria Gambardella Luca Maria Gambardella (born 4 January 1962) is an Italian computer scientist and author. He is the former director of the Dalle Molle Institute for Artificial Intelligence Research in Manno, in the Ticino canton of Switzerland. With Marco Dor ...

Léon Bottou Léon Bottou (born 1965) is a researcher best known for his work in machine learning and data compression. His work presents stochastic gradient descent as a fundamental learning algorithm. He is also one of the main creators of the DjVu image comp ...

Marcus Hutter Marcus Hutter (born April 14, 1967 in Munich) is a DeepMind Senior Scientist researching the mathematical foundations of artificial general intelligence. He is on leave from his professorship at the ANU College of Engineering and Computer Scie ...

Mehryar Mohri Mehryar Mohri is a Professor and theoretical computer scientist at the Courant Institute of Mathematical Sciences. He is also a Research Director at Google Research where he heads the Learning Theory team. Career Prior to joining the Courant In ...

Michael Collins Michael Collins is a computational linguist and researcher in natural language processing and machine learning.

Michael I. Jordan Michael Irwin Jordan (born February 25, 1956) is an American scientist, professor at the University of California, Berkeley and researcher in machine learning, statistics, and artificial intelligence. Jordan was elected a member of the Nation ...

* Michael L. Littman *

Nando de Freitas Nando de Freitas is a researcher in the field of machine learning, and in particular in the subfields of neural networks, Bayesian inference and Bayesian optimization, and deep learning. Biography De Freitas was born in Zimbabwe. He did his ...

* Ofer Dekel *

Oren Etzioni Oren Etzioni (born 1964) is an American entrepreneur, Professor Emeritus of computer science, and founding CEO of the Allen Institute for Artificial Intelligence (AI2). On June 15, 2022, he announced that he will step down as CEO of AI2 effective ...

Pedro Domingos Pedro Domingos is a Professor Emeritus of computer science and engineering at the University of Washington. He is a researcher in machine learning known for Markov logic network enabling uncertain inference. Education Domingos received an und ...

Peter Flach Pieter Adriaan Flach (born 8 April 1961, Sneek) is a Dutch computer scientist and a Professor of Artificial Intelligence in the Department of Computer Science at the University of Bristol. He is author of the acclaimed Simply Logical: Intellige ...

Pierre Baldi Pierre Baldi is a distinguished professor of computer science at University of California Irvine and the director of its Institute for Genomics and Bioinformatics. Education and early life Born in Rome (Italy), Pierre Baldi received his Bache ...

Pushmeet Kohli Pushmeet Kohli is a computer scientist at Google DeepMind where he heads the "Robust and Reliable AI" and "AI for Science" teams. Before joining DeepMind, he was partner scientist and director of research at Microsoft Research and a post-doctor ...

Ray Kurzweil Raymond Kurzweil (born February 12, 1948) is an American computer scientist, author, inventor, and futurist. He is involved in fields such as optical character recognition (OCR), text-to-speech synthesis, speech recognition technology, and e ...

Rayid Ghani Rayid Ghani is a Distinguished Career Professor in the Machine Learning Department (in the School of Computer Science) and the Heinz College of Information Systems and Public Policy at Carnegie Mellon University. Previously, he was the Director of t ...

Ross Quinlan John Ross Quinlan is a computer science researcher in data mining and decision theory. He has contributed extensively to the development of decision tree algorithms, including inventing the canonical C4.5 and ID3 algorithms. He also contributed to ...

* Salvatore J. Stolfo *

Sebastian Thrun Sebastian Thrun (born May 14, 1967) is a German-American entrepreneur, educator, and computer scientist. He is CEO of Kitty Hawk Corporation, and chairman and co-founder of Udacity. Before that, he was a Google VP and Fellow, a Professor of Comp ...

Selmer Bringsjord Selmer Bringsjord (born November 24, 1958) is the chair of the Department of Cognitive Science at Rensselaer Polytechnic Institute and a professor of Computer Science and Cognitive Science. He also holds an appointment in the Lally School of Man ...

Sepp Hochreiter Josef "Sepp" Hochreiter (born 14 February 1967) is a German computer scientist. Since 2018 he has led the Institute for Machine Learning at the Johannes Kepler University of Linz after having led the Institute of Bioinformatics from 2006 to 2018. ...

Shane Legg Shane Legg is a machine learning research director and digital technology entrepreneur who did an AI-related postdoctoral fellowship at University College London's Gatsby Computational Neuroscience Unit, after doctoral work at the Istituto Da ...

Stephen Muggleton Stephen H. Muggleton FBCS, FIET, FAAAI, FECCAI, FSB, FREng (born 6 December 1959, son of Louis Muggleton) is Professor of Machine Learning and Head of the Computational Bioinformatics Laboratory at Imperial College London.

Steve Omohundro Stephen Malvern Omohundro (born 1959) is an American computer scientist whose areas of research include Hamiltonian physics, dynamical systems, programming languages, machine learning, machine vision, and the social implications of artificial int ...

Tom M. Mitchell Tom Michael Mitchell (born August 9, 1951) is an American computer scientist and the Founders University Professor at Carnegie Mellon University (CMU). He is a founder and former Chair of the Machine Learning Department at CMU. Mitchell is known ...

Trevor Hastie Trevor John Hastie (born 27 June 1953) is an American statistician and computer scientist. He is currently serving as the John A. Overdeck Professor of Mathematical Sciences and Professor of Statistics at Stanford University. Hastie is known for ...

Vasant Honavar Vasant G. Honavar is an Indian-born American computer scientist, and artificial intelligence, machine learning, big data, data science, causal inference, knowledge representation, bioinformatics and health informatics researcher and professor. E ...

Vladimir Vapnik Vladimir Naumovich Vapnik (Russian: Владимир Наумович Вапник; born 6 December 1936) is one of the main developers of the Vapnik–Chervonenkis theory of statistical learning, and the co-inventor of the support-vector machine ...

– co-inventor of the SVM and VC theory *

Yann LeCun Yann André LeCun (originally spelled Le Cun; born 8 July 1960) is a French computer scientist working primarily in the fields of machine learning, computer vision, mobile robotics and computational neuroscience. He is the Silver Professor ...

– invented convolutional neural networks *

Yasuo Matsuyama Yasuo Matsuyama (born March 23, 1947) is a Japanese researcher in machine learning and human-aware information processing. Matsuyama is a Professor Emeritus and an Honorary Researcher of the Research Institute of Science and Engineering of Wased ...

Yoshua Bengio Yoshua Bengio (born March 5, 1964) is a Canadian computer scientist, most noted for his work on artificial neural networks and deep learning. He is a professor at the Department of Computer Science and Operations Research at the Université de ...

Zoubin Ghahramani Zoubin Ghahramani FRS (Persian: زوبین قهرمانی; born 8 February 1970) is a British-Iranian researcher and Professor of Information Engineering at the University of Cambridge. He holds joint appointments at University College London and t ...

Bayesian interpretation of kernel regularization Within Bayesian statistics for machine learning, kernel methods arise from the assumption of an inner product space or similarity structure on inputs. For some such methods, such as support vector machines (SVMs), the original formulation and its r ...

* Bayesian optimization *

Bayesian structural time series Bayesian structural time series (BSTS) model is a statistical technique used for feature selection, time series forecasting, nowcasting, inferring causal impact and other applications. The model is designed to work with time series data. The mode ...

Bees algorithm In computer science and operations research, the bees algorithm is a population-based search algorithm which was developed by Pham, Ghanbarzadeh et al. in 2005. [Pham DT, Ghanbarzadeh A, Koc E, Otri S, Rahim S and Zaidi M. The Bees Algorithm. Technic ...]

Behavioral clustering Behavioral clustering is a statistical analysis method used in retailing to identify consumer purchase trends and group stores based on consumer buying behaviors. Traditional versus behavioral clustering Traditional clustering Historically, reta ...

Bernoulli scheme In mathematics, the Bernoulli scheme or Bernoulli shift is a generalization of the Bernoulli process to more than two possible outcomes. Bernoulli schemes appear naturally in symbolic dynamics, and are thus important in the study of dynamical sys ...

Bias–variance tradeoff In statistics and machine learning, the bias–variance tradeoff is the property of a model that the variance of the parameter estimated across samples can be reduced by increasing the bias in the estimated parameters. The bias–variance dil ...

Biclustering Biclustering, block clustering, Co-clustering or two-mode clustering is a data mining technique which allows simultaneous clustering of the rows and columns of a matrix. The term was first introduced by Boris Mirkin to name a technique introduce ...

* BigML *

Binary classification Binary classification is the task of classifying the elements of a set into two groups (each called ''class'') on the basis of a classification rule. Typical binary classification problems include: * Medical testing to determine if a patient has c ...

* Bing Predicts *

Bio-inspired computing Bio-inspired computing, short for biologically inspired computing, is a field of study which seeks to solve computer science problems using models of biology. It relates to connectionism, social behavior, and emergence. Within computer science, b ...

* Biogeography-based optimization *

Biplot Biplots are a type of exploratory graph used in statistics, a generalization of the simple two-variable scatterplot. A biplot overlays a ''score plot'' with a ''loading plot''. A biplot allows information on both samples and variables of a dat ...

Bondy's theorem In mathematics, Bondy's theorem is a bound on the number of elements needed to distinguish the sets in a family of sets from each other. It belongs to the field of combinatorics, and is named after John Adrian Bondy, who published it in 1972. Sta ...

Bongard problem A Bongard problem is a kind of puzzle invented by the Russian computer scientist Mikhail Moiseevich Bongard (Михаил Моисеевич Бонгард, 1924–1971), probably in the mid-1960s. They were published in his 1967 book on pattern re ...

Bradley–Terry model The Bradley–Terry model is a probability model that can predict the outcome of a paired comparison. Given a pair of individuals ''i'' and ''j'' drawn from some population, it estimates the probability that the pairwise comparison ''i'' > ''j'' turns out true, as :P(i > ...

BrownBoost BrownBoost is a boosting algorithm that may be robust to noisy datasets. BrownBoost is an adaptive version of the boost by majority algorithm. As is true for all boosting algorithms, BrownBoost is used in conjunction with other machine learning ...

Brown clustering Brown clustering is a hard hierarchical agglomerative clustering problem based on distributional information proposed by Peter Brown, William A. Brown, Vincent Della Pietra, Peter V. de Souza, Jennifer Lai, and Robert Mercer. It is typically appl ...

Burst error In telecommunication, a burst error or error burst is a contiguous sequence of symbols, received over a communication channel, such that the first and last symbols are in error and there exists no contiguous subsequence of ''m'' correctly receive ...

* CBCL (MIT) *

CIML community portal The computational intelligence and machine learning (CIML) community portal is an international multi-university initiative. Its primary purpose is to help facilitate a virtual scientific community infrastructure for all those involved with, or int ...

CMA-ES Covariance matrix adaptation evolution strategy (CMA-ES) is a particular kind of strategy for numerical optimization. Evolution strategies (ES) are stochastic, derivative-free methods for numerical optimization of non-linear or non-convex continuo ...

CURE data clustering algorithm CURE (Clustering Using REpresentatives) is an efficient data clustering algorithm for large databases. Compared with K-means clustering it is more robust to outliers and able to identify clusters having non-spherical shapes and size variances. D ...

Cache language model A cache language model is a type of statistical language model. These occur in the natural language processing subfield of computer science and assign probabilities to given sequences of words by means of a probability distribution. Statistical lan ...

Calibration (statistics) There are two main uses of the term calibration in statistics that denote special types of statistical inference problems. "Calibration" can mean :*a reverse process to regression, where instead of a future dependent variable being predicted fro ...

Canonical correspondence analysis In multivariate analysis, canonical correspondence analysis (CCA) is an ordination technique that determines axes from the response data as a linear combination of measured predictors. CCA is commonly used in ecology in order to extract gradients th ...

Canopy clustering algorithm The canopy clustering algorithm is an unsupervised pre-clustering algorithm introduced by Andrew McCallum, Kamal Nigam and Lyle Ungar in 2000. It is often used as a preprocessing step for the K-means algorithm or the Hierarchical clustering algorithm ...

Cascading classifiers Cascading is a particular case of ensemble learning based on the concatenation of several classifiers, using all information collected from the output from a given classifier as additional information for the next classifier in the cascade. Unli ...

Category utility Category utility is a measure of "category goodness". It attempts to maximize both the probability that two objects in the same category have attribute values in common, and the probability that objects from different categories ha ...

* CellCognition *

Cellular evolutionary algorithm A cellular evolutionary algorithm (cEA) is a kind of evolutionary algorithm (EA) in which individuals cannot mate arbitrarily, but every one interacts with its closer neighbors on which a basic EA is applied (selection, variation, replacement). ...

Chi-square automatic interaction detection Chi-square automatic interaction detection (CHAID) is a decision tree technique based on adjusted significance testing (Bonferroni correction, Holm-Bonferroni testing). The technique was developed in South Africa and was published in 1980 by Gor ...

Chromosome (genetic algorithm) In genetic algorithms, a chromosome (also sometimes called a genotype) is a set of parameters which define a proposed solution to the problem that the genetic algorithm is trying to solve. The set of all solutions is known as the ''population''. T ...

* Classifier chains *

Cleverbot Cleverbot is a chatterbot web application that uses machine learning techniques to have conversations with humans. It was created by British AI scientist Rollo Carpenter. It was preceded by Jabberwacky, a chatbot project that began in 1988 and ...

Clonal selection algorithm In artificial immune systems, clonal selection algorithms are a class of algorithms inspired by the clonal selection theory of acquired immunity that explains how B and T lymphocytes improve their response to antigens over time called affinity ...

Cluster-weighted modeling In data mining, cluster-weighted modeling (CWM) is an algorithm-based approach to non-linear prediction of outputs (dependent variables) from inputs (independent variables) based on density estimation using a set of models (clusters) that are each ...

Clustering high-dimensional data Clustering high-dimensional data is the cluster analysis of data with anywhere from a few dozen to many thousands of dimensions. Such high-dimensional spaces of data are often encountered in areas such as medicine, where DNA microarray technology ...

Clustering illusion The clustering illusion is the tendency to erroneously consider the inevitable "streaks" or "clusters" arising in small samples from random distributions to be non-random. The illusion is caused by a human tendency to underpredict the amount of v ...

CoBoosting CoBoost is a semi-supervised training algorithm proposed by Collins and Singer in 1999. The original application for the algorithm was the task of Named Entity Classification using very weak learners. [Michael Collins and Yoram Singer, Unsupervised M ...]

Cobweb (clustering) COBWEB is an incremental system for hierarchical conceptual clustering. COBWEB was invented by Professor Douglas H. Fisher, currently at Vanderbilt University. COBWEB incrementally organizes observations into a classification tree Classification ...

* Cognitive computer *

Cognitive robotics Cognitive Robotics or Cognitive Technology is a subfield of robotics concerned with endowing a robot with intelligent behavior by providing it with a processing architecture that will allow it to learn and reason about how to behave in response t ...

Collostructional analysis Collostructional analysis is a family of methods developed by (in alphabetical order) Stefan Th. Gries (University of California, Santa Barbara) and Anatol Stefanowitsch (Free University of Berlin). Collostructional analysis aims at measuring the ...

Common-method variance In applied statistics (e.g., applied to the social sciences and psychometrics), common-method variance (CMV) is the spurious "variance that is attributable to the measurement method rather than to the constructs ...

Complete-linkage clustering Complete-linkage clustering is one of several methods of agglomerative hierarchical clustering. At the beginning of the process, each element is in a cluster of its own. The clusters are then sequentially combined into larger clusters until all ...
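
The merging process described above can be sketched in a few lines. This is a minimal illustration, assuming one-dimensional points and absolute difference as the distance; the function name `complete_linkage` and the `num_clusters` stopping rule are hypothetical choices made for the example, not part of any particular library.

```python
from itertools import combinations


def complete_linkage(points, num_clusters):
    """Agglomerative clustering of 1-D points using complete linkage."""
    # At the beginning, each element is in a cluster of its own.
    clusters = [[p] for p in points]

    def linkage(a, b):
        # Complete linkage: the distance between two clusters is the
        # largest distance between any pair of their members.
        return max(abs(x - y) for x in a for y in b)

    # Sequentially combine the two closest clusters until few enough remain.
    while len(clusters) > num_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return [sorted(c) for c in clusters]
```

Stopping at `num_clusters=1` reproduces the case in which all elements end up in a single cluster.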

Computer-automated design Design Automation usually refers to electronic design automation, or Design Automation which is a Product Configurator. Extending Computer-Aided Design (CAD), automated design and Computer-Automated Design (CAutoD) are more concerned with a broa ...

Concept class In computational learning theory in mathematics, a concept over a domain ''X'' is a total Boolean function over ''X''. A concept class is a class of concepts. Concept classes are a subject of computational learning theory. Concept class terminology ...

Concept drift In predictive analytics and machine learning, concept drift means that the statistical properties of the target variable, which the model is trying to predict, change over time in unforeseen ways. This causes problems because the predictions become ...

Conference on Artificial General Intelligence The Conference on Artificial General Intelligence is a meeting of researchers in the field of Artificial General Intelligence organized by the AGI Society and held annually since 2008. The conference was initiated by the 2006 Bethesda Artificial Gen ...

Conference on Knowledge Discovery and Data Mining SIGKDD, representing the Association for Computing Machinery's (ACM) Special Interest Group (SIG) on Knowledge Discovery and Data Mining, hosts an influential annual conference. Conference history The KDD Conference grew from KDD (Knowledge Di ...

Confirmatory factor analysis In statistics, confirmatory factor analysis (CFA) is a special form of factor analysis, most commonly used in social science research. Kline, R. B. (2010). ''Principles and practice of structural equation modeling (3rd ed.).'' New York, New York: Gu ...

Confusion matrix In the field of machine learning and specifically the problem of statistical classification, a confusion matrix, also known as an error matrix, is a specific table layout that allows visualization of the performance of an algorithm, typically a su ...
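
As a small illustration of the table layout, the sketch below counts how often each actual class is assigned each predicted class. It assumes the common convention of actual classes as rows and predicted classes as columns; the function name `confusion_matrix` and its parameters are hypothetical and chosen only for this example.

```python
def confusion_matrix(actual, predicted, labels):
    """Return a square table: rows are actual classes, columns are predictions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


actual = ["cat", "cat", "dog", "dog", "dog"]
predicted = ["cat", "dog", "dog", "dog", "cat"]
table = confusion_matrix(actual, predicted, labels=["cat", "dog"])
# table[0] holds the counts for actual "cat", table[1] for actual "dog".
```

Entries on the diagonal are correct classifications; off-diagonal entries show which classes are being confused with each other.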

Congruence coefficient In multivariate statistics, the congruence coefficient is an index of the similarity between factors that have been derived in a factor analysis. It was introduced in 1948 by Cyril Burt who referred to it as ''unadjusted correlation''. It is also ca ...

Connect (computer system) Connect is a new social network analysis software data mining computer system developed by HMRC (UK) that cross-references businesses' and people's tax records with other databases to establish fraudulent or undisclosed (misdirected) activity. Hi ...

Consensus clustering Consensus clustering is a method of aggregating (potentially conflicting) results from multiple clustering algorithms. Also called cluster ensembles or aggregation of clustering (or partitions), it refers to the situation in which a number of differ ...

Constrained clustering In computer science, constrained clustering is a class of semi-supervised learning algorithms. Typically, constrained clustering incorporates either a set of must-link constraints, cannot-link constraints, or both, with a data clustering ...

Constrained conditional model A constrained conditional model (CCM) is a machine learning and inference framework that augments the learning of conditional (probabilistic or discriminative) models with declarative constraints. The constraint can be used as a way to incorporate e ...

Constructive cooperative coevolution The constructive cooperative coevolutionary algorithm (also called C3) is a global optimisation algorithm in artificial intelligence based on the multi-start architecture of the greedy randomized adaptive search procedure (GRASP). It incorporates t ...

Correlation clustering Clustering is the problem of partitioning data points into groups based on their similarity. Correlation clustering provides a method for clustering a set of objects into the optimum number of clusters without specifying that number in advance. De ...

Correspondence analysis Correspondence analysis (CA) is a multivariate statistical technique proposed by Herman Otto Hartley (Hirschfeld) and later developed by Jean-Paul Benzécri. It is conceptually similar to principal component analysis, but applies to categorical rat ...

Cortica Headquartered in Tel Aviv, Cortica utilizes unsupervised learning methods to recognize and analyze digital images and video. The technology developed by the Cortica team is based on research of the function of the human brain. Company Founding Co ...

Coupled pattern learner

Cross-entropy method The cross-entropy (CE) method is a Monte Carlo method for importance sampling and optimization. It is applicable to both combinatorial and continuous problems, with either a static or noisy objective. The method approximates the optimal importance ...

Cross-validation (statistics) Cross-validation, sometimes called rotation estimation or out-of-sample testing, is any of various similar model validation techniques for assessing how the results of a statistical analysis will generalize to an independent data set. Cross-v ...
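
One common variant is k-fold cross-validation, in which the data are partitioned into k folds and each fold serves once as the held-out test set. The sketch below only generates the index splits and assumes no shuffling; the name `k_fold_indices` is hypothetical and used purely for illustration.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(5, 2))
```

A model would be fitted on each `train` split and scored on the matching `test` split, and the scores averaged to estimate out-of-sample performance.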

Crossover (genetic algorithm) In genetic algorithms and evolutionary computation, crossover, also called recombination, is a genetic operator used to combine the genetic information of two parents to generate new offspring. It is one way to stochastically generate new soluti ...
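
A minimal sketch of one simple variant, one-point crossover, is given here. It assumes both parents are equal-length lists of genes; the function name and the optional `rng` parameter are hypothetical conveniences for the example.

```python
import random


def one_point_crossover(parent_a, parent_b, rng=random):
    """Combine two equal-length parents by swapping their tails at a random point."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if len(parent_a) < 2:
        raise ValueError("parents need at least two genes")
    # The cut point lies strictly inside the genome, so each child
    # inherits at least one gene from each parent.
    point = rng.randrange(1, len(parent_a))
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2
```

Passing a seeded `random.Random` instance as `rng` makes the stochastic choice of cut point reproducible.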

Cuckoo search In operations research, cuckoo search is an optimization algorithm developed by Xin-She Yang and Suash Deb in 2009. It was inspired by the obligate brood parasitism of some cuckoo species by laying their eggs in the nests of host birds of other s ...

Cultural algorithm Cultural algorithms (CA) are a branch of evolutionary computation where there is a knowledge component that is called the belief space in addition to the population component. In this sense, cultural algorithms can be seen as an extension to a co ...

Cultural consensus theory

Curse of dimensionality The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience. The ...

DADiSP DADiSP (Data Analysis and Display, pronounced day-disp) is a numerical computing environment developed by DSP Development Corporation which allows one to display and manipulate data series, matrices and images with an interface similar to a sp ...

DARPA LAGR Program The Learning Applied to Ground Vehicles (LAGR) program, which ran from 2004 until 2008, had the goal of accelerating progress in autonomous ...

Darkforest

Dartmouth workshop The Dartmouth Summer Research Project on Artificial Intelligence was a 1956 summer workshop widely considered Kline, Ronald R., Cybernetics, Automata Studies and the Dartmouth Conference on Artificial Intelligence, IEEE Annals of the History of ...

DarwinTunes DarwinTunes was a research project into the use of natural selection to create music led by Bob MacCallum and Armand Leroi, scientists at Imperial College London. The project asks volunteers on the Internet to listen to automatically generated sound ...

Data Mining Extensions Data Mining Extensions (DMX) is a query language for data mining models supported by Microsoft's SQL Server Analysis Services product. Like SQL, it supports a data definition language, data manipulation language and a data query language, all t ...

Data exploration Data exploration is an approach similar to initial data analysis, whereby a data analyst uses visual exploration to understand what is in a dataset and the characteristics of the data, rather than through traditional data management systems.

Data pre-processing Data preprocessing can refer to manipulation or dropping of data before it is used in order to ensure or enhance performance, and is an important step in the data mining process. The phrase "garbage in, garbage out" is particularly applicable to ...

Data stream clustering In computer science, data stream clustering is defined as the clustering of data that arrive continuously such as telephone records, multimedia data, financial transactions etc. Data stream clustering is usually studied as a streaming algorithm and ...

Dataiku Dataiku is an artificial intelligence (AI) and machine learning company which was founded in 2013. In December 2019, Dataiku announced that CapitalG—the late-stage growth venture capital fund financed by Alphabet Inc.—joined Dataiku as an inve ...

Davies–Bouldin index The Davies–Bouldin index (DBI), introduced by David L. Davies and Donald W. Bouldin in 1979, is a metric for evaluating clustering algorithms. This is an internal evaluation scheme, where the validation of how well the clustering has been d ...

Decision boundary In a statistical-classification problem with two classes, a decision boundary or decision surface is a hypersurface that partitions the underlying vector space into two sets, one for each class. The classifier will classify all the point ...

Decision list Decision lists are a representation for Boolean functions that is easily learnable from examples. Single term decision lists are more expressive than disjunctions and conjunctions; however, 1-term decision lists are less expressive than the ...
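
To make the representation concrete, the sketch below evaluates a decision list as an ordered sequence of rules in which the first matching rule determines the output. The encoding (each rule as a dictionary of required variable values plus an output, followed by a default) and the name `evaluate_decision_list` are assumptions made for this illustration.

```python
def evaluate_decision_list(rules, default, assignment):
    """Return the output of the first rule whose literals all hold."""
    for literals, output in rules:
        # A rule fires when every variable it mentions has the required value.
        if all(assignment[var] == val for var, val in literals.items()):
            return output
    return default


# A 1-term decision list: every rule tests a single literal.
rules = [
    ({"x1": True}, True),    # if x1 then output True
    ({"x2": False}, False),  # else if not x2 then output False
]
result = evaluate_decision_list(rules, True, {"x1": False, "x2": False})
```

Because rules are tried in order, earlier rules act as exceptions to later ones.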

Decision tree model In computational complexity the decision tree model is the model of computation in which an algorithm is considered to be basically a decision tree, i.e., a sequence of ''queries'' or ''tests'' that are done adaptively, so the outcome of the pre ...

Deductive classifier A deductive classifier is a type of artificial intelligence inference engine. It takes as input a set of declarations in a frame language about a domain such as medical research or molecular biology. For example, the names of classes, sub-classes, ...

DeepArt DeepArt or DeepArt.io was a website that allowed users to create artistic images by using an algorithm to redraw one image using the stylistic elements of another image. with "A Neural Algorithm of Artistic Style" a Neural Style Transfer algorith ...

DeepDream DeepDream is a computer vision program created by Google engineer Alexander Mordvintsev that uses a convolutional neural network to find and enhance patterns in images via algorithmic pareidolia, thus creating a dream-like appearance reminiscent ...

Deep Web Technologies

Defining length In genetic algorithms and genetic programming defining length L(H) is the maximum distance between two defining symbols (that is symbols that have a fixed value as opposed to symbols that can take any value, commonly denoted as # or *) in schema H. ...
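
The definition can be checked with a short function. This sketch assumes a schema is given as a string and that `#` and `*` are the wildcard symbols; the name `defining_length` and the choice to return 0 for a schema with no defining symbols are assumptions of the example.

```python
def defining_length(schema, wildcards="#*"):
    """Distance between the first and last defining (non-wildcard) symbols."""
    fixed = [i for i, symbol in enumerate(schema) if symbol not in wildcards]
    if not fixed:
        return 0  # assumption: a schema with no fixed symbols has length 0
    return fixed[-1] - fixed[0]
```

For example, the schema `1**0*` has defining symbols at positions 0 and 3, so its defining length is 3, while `**1**` has a single defining symbol and a defining length of 0.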

Dendrogram A dendrogram is a diagram representing a tree. This diagrammatic representation is frequently used in different contexts: * in hierarchical clustering, it illustrates the arrangement of the clusters produced by the corresponding analyses. ...

Dependability state model

Detailed balance The principle of detailed balance can be used in kinetic systems which are decomposed into elementary processes (collisions, or steps, or elementary reactions). It states that at equilibrium, each elementary process is in equilibrium with its reve ...

Determining the number of clusters in a data set Determining the number of clusters in a data set, a quantity often labelled ''k'' as in the ''k''-means algorithm, is a frequent problem in data clustering, and is a distinct issue from the process of actually solving the clustering problem. For a ...

Detrended correspondence analysis Detrended correspondence analysis (DCA) is a multivariate statistical technique widely used by ecologists to find the main factors or gradients in large, species-rich but usually sparse data matrices that typify ecological community data. DCA is f ...

Diffbot Diffbot is a developer of machine learning and computer vision algorithms and public APIs for extracting data from web pages / web scraping to create a knowledge base. The company has gained interest from its application of computer vision tec ...

Differential evolution In evolutionary computation, differential evolution (DE) is a method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. Such methods are commonly known as metaheuristics as ...
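
The iterative improvement loop can be sketched as follows. This is an illustrative version of one widely used scheme (often labelled DE/rand/1/bin), not a reference implementation; the function name, the parameter names `f` and `cr`, their default values, and the clipping of trial values to the bounds are all assumptions made for the example.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=100, seed=0):
    """Minimize `objective` over box `bounds`; pop_size must be at least 4."""
    rng = random.Random(seed)
    dim = len(bounds)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in population]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct candidates other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one coordinate is mutated
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == forced:
                    value = population[a][d] + f * (population[b][d] - population[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # keep the trial inside the box
                else:
                    value = population[i][d]
                trial.append(value)
            # Greedy selection: keep the trial only if it is no worse.
            trial_score = objective(trial)
            if trial_score <= scores[i]:
                population[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=lambda i: scores[i])
    return population[best], scores[best]
```

Because a candidate is replaced only when the trial is at least as good, the best score in the population never gets worse from one generation to the next.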

Discrete phase-type distribution The discrete phase-type distribution is a probability distribution that results from a system of one or more inter-related geometric distributions occurring in sequence, or phases. The sequence in which each of the phases occur may itself be a stoc ...

Discriminative model Discriminative models, also referred to as conditional models, are a class of logistical models used for classification or regression. They distinguish decision boundaries through observed data, such as pass/fail, win/lose, alive/dead or healthy/si ...

Dissociated press Dissociated press is a parody generator (a computer program that generates nonsensical text). The generated text is based on another text using the Markov chain technique. The name is a play on "Associated Press" and the psychological term dissoc ...
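
The Markov chain technique mentioned above can be illustrated with a word-level sketch. This is not the algorithm of any particular Dissociated press implementation; the function names `build_chain` and `generate`, and the choice of words rather than characters as states, are assumptions made for the example.

```python
import random
from collections import defaultdict


def build_chain(text, order=1):
    """Map each run of `order` words to the list of words that follow it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain


def generate(chain, start, length, rng=random):
    """Emit up to `length` further words by repeatedly sampling a follower."""
    state = tuple(start)
    output = list(state)
    for _ in range(length):
        followers = chain.get(state)
        if not followers:
            break  # the state never had a successor in the source text
        nxt = rng.choice(followers)
        output.append(nxt)
        state = state[1:] + (nxt,)
    return " ".join(output)
```

Every adjacent word pair in the output occurs somewhere in the source text, which is why the result looks locally plausible while being nonsensical overall.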

Distributed R

Dlib Dlib is a general purpose cross-platform software library written in the programming language C++. Its design is heavily influenced by ideas from design by contract and component-based software engineering. Thus it is, first and foremost, a set ...

Document classification Document classification or document categorization is a problem in library science, information science and computer science. The task is to assign a document to one or more classes or categories. This may be done "manually" (or "intellectually") ...

Documenting Hate Documenting Hate is a project of ProPublica, in collaboration with a number of journalistic, academic, and computing organizations, for systematic tracking of hate crimes and bias incidents. It uses an online form to facilitate reporting of incide ...

Domain adaptation Domain adaptation is a field associated with machine learning and transfer learning. This scenario arises when we aim at learning from a source data distribution a well performing model on a different (but related) target data distribution. For ...

Doubly stochastic model In statistics, a doubly stochastic model is a type of model that can arise in many contexts, but in particular in modelling time-series and stochastic processes. The basic idea for a doubly stochastic model is that an observed random variable is m ...

Dual-phase evolution

Dunn index The Dunn index (DI) (introduced by J. C. Dunn in 1974) is a metric for evaluating clustering algorithms. This is part of a group of validity indices including the Davies–Bouldin index or Silhouette index, in that it is an internal evaluation sch ...

Dynamic Bayesian network A Dynamic Bayesian Network (DBN) is a Bayesian network (BN) which relates variables to each other over adjacent time steps. This is often called a ''Two-Timeslice'' BN (2TBN) because it says that at any point in time T, the value of a variable c ...

Dynamic Markov compression

Dynamic topic model

Dynamic unobserved effects model

EDLUT EDLUT (Event-Driven LookUp Table) is a computer application for simulating networks of spiking neurons. It was developed in the University of Granada and source code was released under GNU GPL version 3. EDLUT uses event-driven simulation scheme ...

ELKI ELKI (for ''Environment for DeveLoping KDD-Applications Supported by Index-Structures'') is a data mining (KDD, knowledge discovery in databases) software framework developed for use in research and teaching. It was originally at the database s ...

Edge recombination operator The edge recombination operator (ERO) is an operator that creates a path that is similar to a set of existing paths (parents) by looking at the edges rather than the vertices. The main application of this is for crossover in genetic algorithms whe ...

Effective fitness In natural evolution and artificial evolution (e.g. artificial life and evolutionary computation) the fitness (or performance or objective measure) of a schema is rescaled to give its effective fitness which takes into account crossover and mu ...

Elastic map Elastic maps provide a tool for nonlinear dimensionality reduction. By their construction, they are a system of elastic springs embedded in the data space. This system approximates a low-dimensional manifold. The elastic coefficients of this s ...

Elastic matching Elastic matching is one of the pattern recognition techniques in computer science. Elastic matching (EM) is also known as deformable template, flexible matching, or nonlinear template matching. Elastic matching can be defined as an optimization pro ...

Elbow method (clustering) In cluster analysis, the elbow method is a heuristic used in determining the number of clusters in a data set. The method consists of plotting the explained variation as a function of the number of clusters and picking the elbow of the curve as th ...

Emergent (software) Emergent (formerly PDP++) is neural simulation software that is primarily intended for creating models of the brain and cognitive processes. Development initially began in 1995 at Carnegie Mellon University, and continues at the University of ...

Encog Encog is a machine learning framework available for Java and .Net (J. Heaton, "Encog: Library of Interchangeable Machine Learning Models for Java and C#", http://www.jmlr.org/papers/volume16/heaton15a/heaton15a.pdf). Encog supports different learning al ...

Entropy rate In the mathematical theory of probability, the entropy rate or source information rate of a stochastic process is, informally, the time density of the average information in a stochastic process. For stochastic processes with a countable index, the ...

Erkki Oja Erkki Oja (born 22 March 1948 in Helsinki) is a Finnish computer scientist and Aalto Distinguished Professor in the Department of Information and Computer Science at Aalto University School of Science. He is recognized for developing Oja' ...

Eurisko

European Conference on Artificial Intelligence The biennial European Conference on Artificial Intelligence (ECAI) is the leading conference in the field of Artificial Intelligence in Europe, and is commonly listed together with IJCAI and AAAI as one of the three major general AI conferences worl ...

Evaluation of binary classifiers The evaluation of binary classifiers compares two methods of assigning a binary attribute, one of which is usually a standard method and the other is being investigated. There are many metrics that can be used to measure the performance of a clas ...

Evolution strategy In computer science, an evolution strategy (ES) is an optimization technique based on ideas of evolution. It belongs to the general class of evolutionary computation or artificial evolution methodologies. History The 'evolution strategy' optimizat ...

Evolution window

Evolutionary Algorithm for Landmark Detection

Evolutionary algorithm In computational intelligence (CI), an evolutionary algorithm (EA) is a subset of evolutionary computation, a generic population-based metaheuristic optimization algorithm. An EA uses mechanisms inspired by biological evolution, such as reproduc ...

Evolutionary art Evolutionary art is a branch of generative art, in which the artist does not do the work of constructing the artwork, but rather lets a system do the construction. In evolutionary art, initially generated art is put through an iterated process o ...

Evolutionary music Evolutionary music is the audio counterpart to evolutionary art, whereby algorithmic music is created using an evolutionary algorithm. The process begins with a population of individuals which by some means or other produce audio (e.g. a piece, m ...

Evolutionary programming Evolutionary programming is one of the four major evolutionary algorithm paradigms. It is similar to genetic programming, but the structure of the program to be optimized is fixed, while its numerical parameters are allowed to evolve. It was first ...

Evolvability (computer science) The term evolvability is used for a recent framework of computational learning introduced by Leslie Valiant in his paper of the same name and described below. The aim of this theory is to model biological evolution and categorize which types of mec ...

Evolved antenna In radio communications, an evolved antenna is an antenna designed fully or substantially by an automatic computer design program that uses an evolutionary algorithm that mimics Darwinian evolu ...

Evolver (software) Evolver is a software package that allows users to solve a wide variety of optimization problems ...

Evolving classification function

Expectation propagation Expectation propagation (EP) is a technique in Bayesian machine learning. EP finds approximations to a probability distribution. It uses an iterative approach that uses the factorization structure of the target distribution. It differs from oth ...

Exploratory factor analysis In multivariate statistics, exploratory factor analysis (EFA) is a statistical method used to uncover the underlying structure of a relatively large set of variables. EFA is a technique within factor analysis whose overarching goal is to identify ...

F1 score In statistical analysis of binary classification, the F-score or F-measure is a measure of a test's accuracy. It is calculated from the precision and recall of the test, where the precision is the number of true positive results divided by the nu ...
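
The calculation from precision and recall can be written out directly. The sketch below computes the F1 score (the harmonic mean of the two) from raw counts; the function name and the convention of returning 0.0 when a denominator is zero are assumptions of the example.

```python
def f1_score(tp, fp, fn):
    """F1 from counts of true positives, false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0  # correct among predicted positives
    recall = tp / (tp + fn) if tp + fn else 0.0     # found among actual positives
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With 8 true positives, 2 false positives and 8 false negatives, precision is 0.8 and recall is 0.5, giving an F1 score of 8/13.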

FLAME clustering

Factor analysis of mixed data In statistics, factor analysis of mixed data or factorial analysis of mixed data (FAMD, in the French original: ''AFDM'' or ''Analyse Factorielle de Données Mixtes''), is the factorial method devoted to data tables in which a group of individuals ...

Factor graph A factor graph is a bipartite graph representing the factorization of a function. In probability theory and its applications, factor graphs are used to represent factorization of a probability distribution function, enabling efficient computatio ...

Factor regression model

Factored language model The factored language model (FLM) is an extension of a conventional language model introduced by Jeff Bilmes and Katrin Kirchhoff in 2003. In an FLM, each word is viewed as a vector of ''k'' factors: w_i = \{f_i^1, \ldots, f_i^k\}. An FLM provides the probabilistic mode ...

Farthest-first traversal In computational geometry, the farthest-first traversal of a compact metric space is a sequence of points in the space, where the first point is selected arbitrarily and each successive point is as far as possible from the set of previously-selec ...
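
For a finite point set the traversal can be computed greedily. The sketch below assumes points are coordinate tuples under Euclidean distance and requires Python 3.8 or later for `math.dist`; the function name `farthest_first` and the `start` parameter are hypothetical.

```python
import math


def farthest_first(points, start=0):
    """Return point indices in farthest-first order, beginning at `start`."""
    order = [start]
    selected = {start}
    # nearest[i] is the distance from point i to its closest selected point.
    nearest = [math.dist(p, points[start]) for p in points]
    while len(order) < len(points):
        # Choose the unselected point farthest from the selected set.
        nxt = max((i for i in range(len(points)) if i not in selected),
                  key=lambda i: nearest[i])
        order.append(nxt)
        selected.add(nxt)
        nearest = [min(d, math.dist(p, points[nxt])) for d, p in zip(nearest, points)]
    return order


order = farthest_first([(0, 0), (1, 0), (10, 0), (4, 0)])
```

Restricting the choice to unselected indices keeps the loop well defined even when the input contains duplicate points.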

Fast-and-frugal trees In the study of decision-making, a fast-and-frugal tree is a simple graphical structure that categorizes objects by asking one question at a time. These decision trees are used in a range of fields: psychology, artificial intelligence, and managem ...

Feature Selection Toolbox Feature Selection Toolbox (FST) is software primarily for feature selection in the machine learning domain, written in C++, developed at the Institute of Information Theory and Automation (UTIA), of the Czech Academy of Sciences. Version 1 Th ...

Feature hashing

Feature scaling Feature scaling is a method used to normalize the range of independent variables or features of data. In data processing, it is also known as data normalization and is generally performed during the data preprocessing step. Motivation Since the ...
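
One simple form of feature scaling is min-max normalization, which rescales a feature to the range [0, 1]. The sketch below is illustrative only; the function name and the handling of a constant feature (mapping every value to 0.0) are assumptions.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no range, so map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


scaled = min_max_scale([10, 20, 30])
```

Applying the same transformation to every feature puts them on a comparable scale before they are passed to a distance-based or gradient-based method.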

Feature vector In machine learning and pattern recognition, a feature is an individual measurable property or characteristic of a phenomenon. Choosing informative, discriminating and independent features is a crucial element of effective algorithms in pattern r ...

Firefly algorithm In mathematical optimization, the firefly algorithm is a metaheuristic proposed by Xin-She Yang and inspired by the flashing behavior of fireflies. Algorithm In pseudocode the algorithm can be stated as: Begin 1) Objective function: ...

First-difference estimator

First-order inductive learner

Fish School Search Fish School Search (FSS), proposed by Bastos Filho and Lima Neto in 2008, is, in its basic version, a unimodal optimization algorithm inspired by the collective behavior of fish schools. The mechanisms of feeding and coordinated movement were used a ...

Fisher kernel In statistical classification, the Fisher kernel, named after Ronald Fisher, is a function that measures the similarity of two objects on the basis of sets of measurements for each object and a statistical model. In a classification procedure, the ...

Fitness approximation Fitness approximation (Y. Jin, "A comprehensive survey of fitness approximation in evolutionary computation", ''Soft Computing'', 9:3–12, 2005) aims to approximate the objective or fitness functions in evolutionary optimization by building up machine l ...

Fitness function A fitness function is a particular type of objective function that is used to summarise, as a single figure of merit, how close a given design solution is to achieving the set aims. Fitness functions are used in genetic ...

Fitness proportionate selection Fitness proportionate selection, also known as roulette wheel selection, is a genetic operator used in genetic algorithms for selecting potentially useful solutions for recombination. In fitness proportionate selection, as in all selection methods ...
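
The roulette wheel idea can be sketched as follows: each individual occupies a slice of the wheel proportional to its fitness, and a random point on the wheel picks the winner. The function name, the `rng` parameter, and the requirement that fitness values be non-negative with a positive sum are assumptions of this example.

```python
import random


def roulette_wheel_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("fitness values must be non-negative with a positive sum")
    pick = rng.random() * total  # a point in [0, total) on the wheel
    running = 0.0
    for individual, fitness in zip(population, fitnesses):
        running += fitness
        if pick < running:
            return individual
    return population[-1]  # guard against floating-point round-off
```

An individual with zero fitness occupies an empty slice and is never chosen by the loop, while fitter individuals are chosen more often without being guaranteed selection.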

Fluentd Fluentd is a cross-platform open-source data collection software project originally developed at Treasure Data. It is written primarily in the Ruby programming language. Overview Fluentd was positioned for "big data", semi- or un-structured ...

Folding@home Folding@home (FAH or F@h) is a volunteer computing project aimed to help scientists develop new therapeutics for a variety of diseases by the means of simulating protein dynamics. This includes the process of protein folding and the movements ...

Formal concept analysis

Forward algorithm

Fowlkes–Mallows index The Fowlkes–Mallows index is an external evaluation method that is used to determine the similarity between two clusterings (clusters obtained after a clustering algorithm), and also a metric to measure confusion matrices. This measure of simi ...
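
The index can be computed by counting pairs of points. In the sketch below, a pair is a true positive when both clusterings place the two points together; the index is the geometric mean of pairwise precision and recall. The function name and the convention of returning 0.0 when there are no true-positive pairs are assumptions made for the example.

```python
from itertools import combinations
from math import sqrt


def fowlkes_mallows(labels_a, labels_b):
    """Similarity of two clusterings given as per-point cluster labels."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            tp += 1  # together in both clusterings
        elif same_b:
            fp += 1  # together only in the second clustering
        elif same_a:
            fn += 1  # together only in the first clustering
    if tp == 0:
        return 0.0
    # sqrt(TP/(TP+FP) * TP/(TP+FN)) simplifies to the expression below.
    return tp / sqrt((tp + fp) * (tp + fn))
```

Only co-membership matters, so relabelling the clusters of either input leaves the value unchanged.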

Frederick Jelinek Frederick Jelinek (18 November 1932 – 14 September 2010) was a Czech-American researcher in information theory, automatic speech recognition, and natural language processing. He is well known for his oft-quoted statement, "Every time I fire a ...

Frrole

Functional principal component analysis

GATTO

GLIMMER In bioinformatics, GLIMMER (Gene Locator and Interpolated Markov ModelER) is used to find genes in prokaryotic DNA. "It is effective at finding genes in bacteria, archea, viruses, typically finding 98-99% of all relatively long protein coding g ...

Gary Bryce Fogel Gary Bryce Fogel (born 1968) is an American biologist and computer scientist. He is the Chief Executive Officer of Natural Selection, Inc. He is most known for his applications of computational intelligence and machine learning to bioinformatics, ...

Gaussian adaptation Gaussian adaptation (GA), also called normal or natural adaptation (NA) is an evolutionary algorithm designed for the maximization of manufacturing yield due to statistical deviation of component values of signal processing systems. In short, GA ...

Gaussian process In probability theory and statistics, a Gaussian process is a stochastic process (a collection of random variables indexed by time or space), such that every finite collection of those random variables has a multivariate normal distribution, i.e. e ...

Gaussian process emulator In statistics, Gaussian process emulator is one name for a general type of statistical model that has been used in contexts where the problem is to make maximum use of the outputs of a complicated (often non-random) computer-based simulation model. ...

Gene prediction In computational biology, gene prediction or gene finding refers to the process of identifying the regions of genomic DNA that encode genes. This includes protein-coding genes as well as RNA genes, but may also include prediction of other functiona ...

General Architecture for Text Engineering General Architecture for Text Engineering or GATE is a Java suite of tools originally developed at the University of Sheffield beginning in 1995 and now used worldwide by a wide community of scientists, companies, teachers and students for many nat ...

Generalization error For supervised learning applications in machine learning and statistical learning theory, generalization error (also known as the out-of-sample error or the risk) is a measure of how accurately an algorithm is able to predict outcome values for pre ...

Generalized canonical correlation In statistics, the generalized canonical correlation ...

Generalized filtering Generalized filtering is a generic Bayesian filtering scheme for nonlinear state-space models. It is based on a variational principle of least action, formulated in generalized coordinates of motion. Note that "generalized coordinates of motion" a ...

Generalized iterative scaling In statistics, generalized iterative scaling (GIS) and improved iterative scaling (IIS) are two early algorithms used to fit log-linear models, notably multinomial logistic regression (MaxEnt) classifiers and extensions of it such as MaxEnt Marko ...

Generalized multidimensional scaling Generalized multidimensional scaling (GMDS) is an extension of metric multidimensional scaling, in which the target space is non-Euclidean. When the dissimilarities are distances on a surface and the target space is another surface, GMDS allows find ...

Generative adversarial network A generative adversarial network (GAN) is a class of machine learning frameworks designed by Ian Goodfellow and his colleagues in June 2014. Two neural networks contest with each other in the form of a zero-sum game, where one agent's gain is a ...

Generative model In statistical classification, two main approaches are called the generative approach and the discriminative approach. These compute classifiers by different approaches, differing in the degree of statistical modelling. Terminology is inconsis ...

Genetic algorithm In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to gene ...

Genetic algorithm scheduling

Genetic algorithms in economics

Genetic fuzzy systems In computer science and operations research, genetic fuzzy systems are fuzzy systems constructed by using genetic algorithms or genetic programming, which mimic the process of natural evolution, to identify its structure an ...

* Genetic memory (computer science) *

Genetic operator A genetic operator is an operator used in genetic algorithms to guide the algorithm towards a solution to a given problem. There are three main types of operators (mutation, crossover and selection), which must work in conjunction with one anothe ...
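
The three operator types named above can be sketched on bit strings as follows. This is a minimal illustration, not a reference implementation: tournament selection, one-point crossover, bit-flip mutation, the toy "OneMax" fitness (count of ones), and all parameter values are assumptions chosen for the example.

```python
import random

def select(population, fitness, rng, k=2):
    """Tournament selection: the fittest of k randomly drawn individuals."""
    return max(rng.sample(population, k), key=fitness)

def crossover(a, b, rng):
    """One-point crossover: a prefix of a joined to the matching suffix of b."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def mutate(bits, rng, rate=0.05):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in bits]

def evolve(pop_size=20, length=16, generations=40, seed=0):
    """Run the three operators in conjunction on a toy problem."""
    rng = random.Random(seed)
    fitness = sum  # hypothetical objective: number of ones in the string
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population = [
            mutate(crossover(select(population, fitness, rng),
                             select(population, fitness, rng), rng), rng)
            for _ in range(pop_size)
        ]
    return max(population, key=fitness)

best = evolve()
print(best, sum(best))
```

Selection supplies the pressure toward fitter individuals, while crossover and mutation supply variation; removing any one of the three usually stalls the search.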

Genetic programming In artificial intelligence, genetic programming (GP) is a technique of evolving programs, starting from a population of unfit (usually random) programs, fit for a particular task by applying operations analogous to natural genetic processes to t ...

Genetic representation In computer programming, genetic representation is a way of presenting solutions/individuals in evolutionary computation methods. Genetic representation can encode appearance, behavior, physical qualities of individuals. Designing a good genetic r ...

Geographical cluster A geographical cluster is a localized anomaly, usually an excess of something given the distribution or variation of something else. Often it is considered as an incidence rate ...

Gesture Description Language Gesture Description Language (GDL or GDL Technology) is a method for the description and automatic (computer) syntactic classification of gestures and movements, created by Dr Tomasz Hachaj (PhD) and Professor Marek R. Ogiela (PhD, DSc). GDL uses con ...

Geworkbench geWorkbench (genomics Workbench) is an open-source software platform for integrated genomic data analysis. It is a desktop application written in the programming language Java. geWorkbench uses a component architect ...

Glossary of artificial intelligence This glossary of artificial intelligence is a list of definitions of terms and concepts relevant to the study of artificial intelligence, its sub-disciplines, and related fields. Related glossaries include Glossary of computer science, Glossary o ...

Glottochronology Glottochronology (from Attic Greek γλῶττα ''tongue, language'' and χρόνος ''time'') is the part of lexicostatistics which involves comparative linguistics and deals with the chronological relationship between languages. ...

Golem (ILP) Golem is an inductive logic programming algorithm developed by Stephen Muggleton and Cao Feng in 1990. It uses the technique of relative least general generalisation proposed by Gordon Plotki ...

Google matrix A Google matrix is a particular stochastic matrix that is used by Google's PageRank algorithm. The matrix represents a graph with edges representing links between pages. The PageRank of each page can then be generated iteratively from the Google ...
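
The iterative computation mentioned above can be sketched in plain Python. The damping factor of 0.85, the uniform treatment of pages without outgoing links, the fixed iteration count, and the three-page link graph are all assumptions made for illustration; the helper names are hypothetical.

```python
def google_matrix(links, n, damping=0.85):
    """Build a column-stochastic Google matrix for n pages.

    `links` maps a page index to the list of pages it links to.
    """
    G = [[0.0] * n for _ in range(n)]
    for j in range(n):
        targets = links.get(j, [])
        for i in range(n):
            if targets:
                share = targets.count(i) / len(targets)
            else:
                share = 1.0 / n  # assumed: a page with no links jumps uniformly
            G[i][j] = damping * share + (1.0 - damping) / n
    return G

def pagerank(G, iterations=100):
    """Power iteration: repeatedly multiply the rank vector by G."""
    n = len(G)
    rank = [1.0 / n] * n
    for _ in range(iterations):
        rank = [sum(G[i][j] * rank[j] for j in range(n)) for i in range(n)]
    return rank

links = {0: [1, 2], 1: [2], 2: [0]}  # toy graph: page -> pages it links to
G = google_matrix(links, 3)
ranks = pagerank(G)
print(ranks)
```

Because every column of the matrix sums to 1, the rank vector stays a probability distribution at each step.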

Grafting (decision trees) Grafting is the process of adding nodes to inferred decision trees to improve the predictive accuracy. A decision tree is a graphical model that is used as a support tool for decision process. Introduction Once the decision tree is constructed, th ...

Gramian matrix In linear algebra, the Gram matrix (or Gramian matrix, Gramian) of a set of vectors v_1,\dots, v_n in an inner product space is the Hermitian matrix of inner products, whose entries are given by the inner product G_{ij} = \left\langle v_i, v_j \right\rangle ...
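
As a small worked example, the sketch below computes the matrix of pairwise inner products for real vectors under the ordinary dot product, in which case the Hermitian matrix is simply symmetric. The function name and the sample vectors are illustrative assumptions.

```python
def gram_matrix(vectors):
    """Return G with G[i][j] equal to the dot product of vectors i and j."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    return [[dot(u, v) for v in vectors] for u in vectors]

vectors = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
G = gram_matrix(vectors)
for row in G:
    print(row)
```

The diagonal holds the squared norms of the vectors, and the zero entry shows that the first and third vectors are orthogonal.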

* Grammatical evolution *

Granular computing Granular computing (GrC) is an emerging computing paradigm of information processing that concerns the processing of complex information entities called "information granules", which arise in the process of data abstraction and derivation of knowl ...

GraphLab Turi is a graph-based, high performance, distributed computation framework written in C++. The GraphLab project was started by Prof. Carlos Guestrin of Carnegie Mellon University in 2009. It is an open source project using an Apache License. ...

Graph kernel In structure mining, a graph kernel is a kernel function that computes an inner product on graphs. Graph kernels can be intuitively understood as functions measuring the similarity of pairs of graphs. They allow kernelized learning algorithms su ...

Gremlin (programming language) Gremlin is a graph traversal language and virtual machine developed by Apache TinkerPop of the Apache Software Foundation. Gremlin works for both OLTP-based graph databases as well as OLAP-based graph processors. Gremlin's automata and functional ...

Growth function The growth function, also called the shatter coefficient or the shattering number, measures the richness of a set family. It is especially used in the context of statistical learning theory, where it measures the complexity of a hypothesis class. Th ...

* HUMANT (HUManoid ANT) algorithm *

Hammersley–Clifford theorem The Hammersley–Clifford theorem is a result in probability theory, mathematical statistics and statistical mechanics that gives necessary and sufficient conditions under which a strictly positive probability distribution (of events in a probabilit ...

Harmony search Harmony search is a metaphor-based metaheuristic inspired by the improvisation process of musicians; it is covered in the chronologically ordered list of metaphor-based metaheuristics and swarm intelligence algorithms ...

Hebbian theory Hebbian theory is a neuroscientific theory claiming that an increase in synaptic efficacy arises from a presynaptic cell's repeated and persistent stimulation of a postsynaptic cell. It is an attempt to explain synaptic plasticity, the adaptatio ...

* Hidden Markov random field *

Hidden semi-Markov model A hidden semi-Markov model (HSMM) is a statistical model with the same structure as a hidden Markov model except that the unobservable process is semi-Markov rather than Markov. ...

* Higher-order factor analysis *

Highway network In machine learning, the Highway Network was the first working very deep feedforward neural network with hundreds of layers, much deeper than previous artificial neural networks. It uses skip connections modulated by learned gating mechanisms to ...

Hinge loss In machine learning, the hinge loss is a loss function used for training classifiers. The hinge loss is used for "maximum-margin" classification, most notably for support vector machines (SVMs). For an intended output t = ±1 and a classifier score y, th ...
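
A minimal sketch of the standard definition, max(0, 1 − t·y), where t is the intended output (+1 or −1) and y is the raw classifier score. The function name and the sample scores are illustrative.

```python
def hinge_loss(t, y):
    """Hinge loss for intended output t (+1 or -1) and classifier score y."""
    return max(0.0, 1.0 - t * y)

print(hinge_loss(+1, 2.0))   # correct side, outside the margin
print(hinge_loss(+1, 0.5))   # correct side, but inside the margin
print(hinge_loss(-1, 0.5))   # wrong side of the decision boundary
```

The loss is zero only when the score has the right sign and a magnitude of at least 1, which is what pushes a trained classifier toward a large margin.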

Holland's schema theorem Holland's schema theorem, also called the fundamental theorem of genetic algorithms, is an inequality that results from coarse-graining an equation for evolutionary dynamics. The Schema Theorem says that short, low-order schemata with above-average ...

Hopkins statistic The Hopkins statistic (introduced by Brian Hopkins and John Gordon Skellam) is a way of measuring the cluster tendency of a data set. It belongs to the family of sparse sampling tests. It acts as a statistical hypothesis test where the null hypoth ...

Hoshen–Kopelman algorithm The Hoshen–Kopelman algorithm is a simple and efficient algorithm for labeling clusters on a grid, where the grid is a regular network of cells, with the cells being either occupied or unoccupied. This algorithm is based on a well-known union- ...
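
The grid labeling described above can be sketched with a raster scan and a small union-find structure. This is a simplified illustration assuming 4-connectivity (up and left neighbours during the scan); the final labels are cluster identifiers and are not guaranteed to be consecutive.

```python
def hoshen_kopelman(grid):
    """Label 4-connected clusters of occupied (truthy) cells in a 2D grid."""
    parent = [0]  # parent[label]; label 0 is reserved for unoccupied cells

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if not up and not left:
                parent.append(len(parent))      # start a new cluster
                labels[r][c] = len(parent) - 1
            elif up and left:
                root_up, root_left = find(up), find(left)
                low, high = min(root_up, root_left), max(root_up, root_left)
                parent[high] = low              # union the two clusters
                labels[r][c] = low
            else:
                labels[r][c] = find(up or left)
    # Second pass: replace each provisional label by its cluster root.
    return [[find(x) if x else 0 for x in row] for row in labels]

grid = [
    [1, 0, 1],
    [1, 0, 1],
    [1, 1, 1],
]
for row in hoshen_kopelman(grid):
    print(row)
```

In the sample grid the two columns first receive different provisional labels and are merged when the scan reaches the bottom-right cell, so the second pass reports a single cluster.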

Huber loss In statistics, the Huber loss is a loss function used in robust regression that is less sensitive to outliers in data than the squared error loss. ...

IRCF360 Infrared Control Freak 360 (IRCF360) is a 360-degree proximity sensor and motion sensing device, developed by ROBOTmaker. The sensor is in BETA developers release as a low cost (software configurable) sensor for use within research, technical ...

* Ilastik *

Ilya Sutskever Ilya Sutskever is a computer scientist working in machine learning, who co-founded and serves as Chief Scientist of OpenAI. He has made several major contributions to the field of deep learning. He is the co-inventor, with Alex Krizhevsky and Ge ...

Immunocomputing In academia, computational immunology is a field of science that encompasses high-throughput genomic and bioinformatics approaches to immunology. The field's main aim is to convert immunological data into computational problems, solve these problem ...

Imperialist competitive algorithm In computer science, imperialist competitive algorithms are a type of computational method used to solve optimization problems of different types. Like most of the methods in the area of evolutionary computation, ICA does not need the gradient of ...

* Inauthentic text * Incremental decision tree * Induction of regular languages *

Inductive probability Inductive probability attempts to give the probability of future events based on past events. It is the basis for inductive reasoning, and gives the mathematical basis for learning and the perception of patterns. It is a source of knowledge about t ...

Inductive programming Inductive programming (IP) is a special area of automatic programming, covering research from artificial intelligence and programming, which addresses learning of typically declarative (logic or functional) and often recursive programs from incom ...

Influence diagram An influence diagram (ID), also called a relevance diagram, decision diagram or decision network, is a compact graphical and mathematical representation of a decision situation. ...

Information Harvesting Information Harvesting (IH) was an early data mining product from the 1990s. It was invented by Ralphe Wiggins and produced by the Ryan Corp, later Information Harvesting Inc., of Cambridge, Massachusetts. Wiggins had a background in genetic alg ...

* Information fuzzy networks *

Information gain in decision trees In information theory and machine learning, information gain is a synonym for ''Kullback–Leibler divergence''; the amount of information gained about a random variable or signal from observing another random variable. However, in the context of ...

* Information gain ratio * Inheritance (genetic algorithm) * Instance selection *

Intel RealSense Intel RealSense Technology is a product range of depth and tracking technologies designed to give machines and devices depth perception capabilities. The technologies, owned by Intel, are used in autonomous drones, robots, AR/VR, smart home device ...

Interacting particle system In probability theory, an interacting particle system (IPS) is a stochastic process (X(t))_{t \in \mathbb{R}^+} on some configuration space \Omega = S^G given by a site space, a countably infinite graph G and a local state space, a compact metric space S . More ...

Interactive machine translation Interactive machine translation (IMT) is a specific sub-field of computer-aided translation. Under this translation paradigm, the computer software that assists the human translator attempts to predict the text the user is going to input by taking ...

International Joint Conference on Artificial Intelligence The International Joint Conference on Artificial Intelligence (IJCAI) is the leading conference in the field of Artificial Intelligence. The conference series has been organized by the nonprofit IJCAI Organization since 1969, making it the oldest p ...

International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics The International Conference on Computational Intelligence Methods for Bioinformatics and Biostatistics (CIBB) is a yearly scientific conference focused on machine learning and computational intelligence applied to bioinformatics and biostatis ...

International Semantic Web Conference The International Semantic Web Conference (ISWC) is a series of academic conferences and the premier international forum for the Semantic Web, Linked Data and Knowledge Graph Community. Here, scientists, industry specialists, and practitioners ...

Iris flower data set The ''Iris'' flower data set or Fisher's ''Iris'' data set is a multivariate data set used and made famous by the British statistician and biologist Ronald Fisher in his 1936 paper ''The use of multiple measurements in taxonomic problems'' as an ...

Island algorithm The island algorithm is an algorithm for performing inference on hidden Markov models, or their generalization, dynamic Bayesian networks. It calculates the marginal distribution for each unobserved node, conditional on any observed nodes. The is ...

Isotropic position In the fields of machine learning, the theory of computation, and random matrix theory, a probability distribution over vectors is said to be in isotropic position if its covariance matrix is equal to the identity matrix. Formal definitions Le ...

Item response theory In psychometrics, item response theory (IRT) (also known as latent trait theory, strong true score theory, or modern mental test theory) is a paradigm for the design, analysis, and scoring of tests, questionnaires, and similar instruments measuring ...

Iterative Viterbi decoding Iterative Viterbi decoding is an algorithm that spots the subsequence ''S'' of an observation ''O'' = {o_1, ..., o_n} having the highest average probability (i.e., probability scaled by the length of ''S'') of being generated by a given hidden Markov model ''M'' w ...

JOONE JOONE (Java Object Oriented Neural Engine) is a component based neural network framework built in Java. Features Joone consists of a component-based architecture based on linkable components that can be extended to b ...

Jabberwacky Jabberwacky is a chatterbot created by British programmer Rollo Carpenter. Its stated aim is to "simulate natural human chat in an interesting, entertaining and humorous manner". It is an early attempt at creating an artificial intelligence th ...

Jaccard index The Jaccard index, also known as the Jaccard similarity coefficient, is a statistic used for gauging the similarity and diversity of sample sets. It was developed by Grove Karl Gilbert in 1884 as his ratio of verification (v) and now is freque ...
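
A minimal sketch of the usual set-based definition, the size of the intersection divided by the size of the union. Returning 1.0 for two empty sets is a convention assumed here; the function name and sample sets are illustrative.

```python
def jaccard_index(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumed convention: two empty sets count as identical
    return len(a & b) / len(a | b)

print(jaccard_index({1, 2, 3}, {2, 3, 4}))
print(jaccard_index("night", "nacht"))  # compares the sets of characters
```

The value ranges from 0 for disjoint sets to 1 for identical sets, and one minus the index is commonly used as a distance.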

* Jackknife variance estimates for random forest * Java Grammatical Evolution *

Joseph Nechvatal Joseph Nechvatal (born January 15, 1951) is an American post-conceptual digital artist and art theoretician who creates computer-assisted paintings and computer animations, often using custom-created computer viruses. Life and work ...

* Jubatus *

Julia (programming language) Julia is a high-level, dynamic programming language. Its features are well suited for numerical analysis and computational science. Distinctive aspects of Julia's design include a type system with parametric polymorphism in a dynamic programmi ...

Junction tree algorithm The junction tree algorithm (also known as 'Clique Tree') is a method used in machine learning to extract marginalization in general graphs. In essence, it entails performing belief propagation on a modified graph called a junction tree. The gra ...

K-SVD In applied mathematics, K-SVD is a dictionary learning algorithm for creating a dictionary for sparse representations, via a singular value decomposition approach. K-SVD is a generalization of the k-means clustering method, and it works by iterati ...

* K-means++ *

K-medians clustering In statistics, ''k''-medians clustering [P. S. Bradley, O. L. Mangasarian, and W. N. Street, "Clustering via Concave Minimization," in Advances in Neural Information Processing Systems, vol. 9, M. C. Mozer, M. I. Jordan, and T. Petsche, Eds. Cambrid ...]

K-medoids The ''k''-medoids problem is a clustering problem similar to ''k''-means. The name was coined by Leonard Kaufman and Peter J. Rousseeuw with their PAM algorithm. Both the ''k''-means and ''k''-medoids algorithms are partitional (breaking the dataset up into group ...

KNIME KNIME, the Konstanz Information Miner, is a free and open-source data analytics, reporting and integration platform. KNIME integrates various components for machine learning and data mining through its modular data pipelining "Building Blocks ...

KXEN Inc. KXEN was an American software company which existed from 1998 to 2013 when it was acquired by SAP AG. History KXEN was founded in June 1998 by Roger Haddad and Michel Bera. It was based in San Francisco, California with offices in Paris and L ...

K q-flats In data mining and machine learning, the ''k'' ''q''-flats algorithm is an iterative method which aims to partition observations into ''k'' clusters where each cluster is close to a ''q''-flat, where ''q'' is a given integer. It is a generalization of the ''k''-means algorithm ...

* Kaggle * Kalman filter * Katz's back-off model * Kernel adaptive filter * Kernel density estimation * Kernel eigenvoice * Kernel embedding of distributions * Kernel method * Kernel perceptron * Kernel random forest * Kinect * Klaus-Robert Müller * Kneser–Ney smoothing * Knowledge Vault * Knowledge integration * LIBSVM * LPBoost * Labeled data * LanguageWare * Language identification in the limit * Language model * Large margin nearest neighbor * Latent Dirichlet allocation * Latent class model * Latent semantic analysis * Latent variable * Latent variable model * Lattice Miner * Layered hidden Markov model * Learnable function class * Least squares support vector machine * Leave-one-out error * Leslie P. Kaelbling * Linear genetic programming * Linear predictor function * Linear separability * Lingyun Gu * Linkurious * Lior Ron (business executive) * List of genetic algorithm applications * List of metaphor-based metaheuristics * List of text mining software * Local case-control sampling * Local independence * Local tangent space alignment * Locality-sensitive hashing * Log-linear model * Logistic model tree * Low-rank approximation * Low-rank matrix approximations * MATLAB * MIMIC (immunology) * MXNet * Mallet (software project) * Manifold regularization * Margin-infused relaxed algorithm * Margin classifier * Mark V. Shaney * Massive Online Analysis * Matrix regularization * Matthews correlation coefficient * Mean shift *

* Measurement invariance * Medoid * MeeMix * Melomics * Memetic algorithm * Meta-optimization * Mexican International Conference on Artificial Intelligence * Michael Kearns (computer scientist) * MinHash * Mixture model * Mlpy * Models of DNA evolution * Moral graph * Mountain car problem * Movidius *

* Multi expression programming * Multiclass classification * Multidimensional analysis * Multifactor dimensionality reduction * Multilinear principal component analysis * Multiple correspondence analysis * Multiple discriminant analysis * Multiple factor analysis * Multiple sequence alignment * Multiplicative weight update method * Multispectral pattern recognition * Mutation (genetic algorithm) * MysteryVibe * N-gram * NOMINATE (scaling method) * Native-language identification * Natural Language Toolkit * Natural evolution strategy * Nearest-neighbor chain algorithm * Nearest centroid classifier * Nearest neighbor search * Neighbor joining * Nest Labs * NetMiner * NetOwl * Neural Designer * Neural Engineering Object * Neural Lab * Neural modeling fields * Neural network software * NeuroSolutions * Neuro Laboratory * Neuroevolution * Neuroph * Niki.ai * Noisy channel model * Noisy text analytics * Nonlinear dimensionality reduction * Novelty detection * Nuisance variable * One-class classification * Onnx * OpenNLP * Optimal discriminant analysis * Oracle Data Mining * Orange (software) * Ordination (statistics) * Overfitting * PROGOL * PSIPRED * Pachinko allocation * PageRank * Parallel metaheuristic * Parity benchmark * Part-of-speech tagging * Particle swarm optimization * Path dependence * Pattern language (formal languages) * Peltarion Synapse * Perplexity * Persian Speech Corpus * Picas (app) * Pietro Perona * Pipeline Pilot * Piranha (software) * Pitman–Yor process * Plate notation * Polynomial kernel * Pop music automation * Population process * Portable Format for Analytics * Predictive Model Markup Language * Predictive state representation * Preference regression * Premature convergence * Principal geodesic analysis * Prior knowledge for pattern recognition * Prisma (app) * Probabilistic Action Cores * Probabilistic context-free grammar * Probabilistic latent semantic analysis * Probabilistic soft logic * Probability matching * Probit model * Product of 
experts * Programming with Big Data in R * Proper generalized decomposition * Pruning (decision trees) * Pushpak Bhattacharyya * Q methodology * Qloo * Quality control and genetic algorithms * Quantum Artificial Intelligence Lab * Queueing theory * Quick, Draw! * R (programming language) * Rada Mihalcea * Rademacher complexity * Radial basis function kernel * Rand index * Random indexing * Random projection * Random subspace method * Ranking SVM * RapidMiner * Rattle GUI * Raymond Cattell * Reasoning system * Regularization perspectives on support vector machines * Relational data mining * Relationship square * Relevance vector machine * Relief (feature selection) * Renjin * Repertory grid * Representer theorem * Reward-based selection * Richard Zemel * Right to explanation * RoboEarth * Robust principal component analysis * RuleML Symposium * Rule induction * Rules extraction system family * SAS (software) * SNNS * SPSS Modeler * SUBCLU * Sample complexity * Sample exclusion dimension * Santa Fe Trail problem * Savi Technology * Schema (genetic algorithms) * Search-based software engineering * Selection (genetic algorithm) * Self-Service Semantic Suite * Semantic folding * Semantic mapping (statistics) * Semidefinite embedding * Sense Networks * Sensorium Project * Sequence labeling * Sequential minimal optimization * Shattered set * Shogun (toolbox) * Silhouette (clustering) * SimHash * SimRank * Similarity measure * Simple matching coefficient * Simultaneous localization and mapping * Sinkov statistic * Sliced inverse regression * Snakes and Ladders * Soft independent modelling of class analogies * Soft output Viterbi algorithm * Solomonoff's theory of inductive inference * SolveIT Software * Spectral clustering * Spike-and-slab variable selection * Statistical machine translation * Statistical parsing * Statistical semantics * Stefano Soatto * Stephen Wolfram * Stochastic block model * Stochastic cellular automaton * Stochastic diffusion search * Stochastic 
grammar * Stochastic matrix * Stochastic universal sampling * Stress majorization * String kernel * Structural equation modeling * Structural risk minimization * Structured sparsity regularization * Structured support vector machine * Subclass reachability * Sufficient dimension reduction * Sukhotin's algorithm * Sum of absolute differences * Sum of absolute transformed differences * Swarm intelligence * Switching Kalman filter * Symbolic regression * Synchronous context-free grammar * Syntactic pattern recognition * TD-Gammon * TIMIT * Teaching dimension * Teuvo Kohonen * Textual case-based reasoning * Theory of conjoint measurement * Thomas G. Dietterich * Thurstonian model * Topic model * Tournament selection * Training, test, and validation sets * Transiogram * Trax Image Recognition * Trigram tagger * Truncation selection * Tucker decomposition * UIMA * UPGMA * Ugly duckling theorem * Uncertain data * Uniform convergence in probability * Unique negative dimension * Universal portfolio algorithm * User behavior analytics * VC dimension * VIGRA * Validation set * Vapnik–Chervonenkis theory * Variable-order Bayesian network * Variable kernel density estimation * Variable rules analysis * Variational message passing * Varimax rotation * Vector quantization * Vicarious (company) * Viterbi algorithm * Vowpal Wabbit * WACA clustering algorithm * WPGMA * Ward's method * Weasel program * Whitening transformation * Winnow (algorithm) * Win–stay, lose–switch * Witness set * Wolfram Language * Wolfram Mathematica * Writer invariant * Xgboost * Yooreeka * Zeroth (software)

External links

* Data Science: Data to Insights from MIT (machine learning)
* Popular online course by Andrew Ng, at Coursera. It uses GNU Octave. The course is a free version of Stanford University's actual course taught by Ng, whose lectures are also available for free (see.stanford.edu/Course/CS229).
* mloss is an academic database of open-source machine learning software.